{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Training\n",
    "\n",
    "This notebook contains all the commands for training/fine-tuning a suit of pair classifiers used in Fig. 2 of the following paper:\n",
    "\n",
    "```\n",
    "Hosseini, Nanni and Coll Ardanuy (2020), DeezyMatch: A Flexible Deep Learning Approach to Fuzzy String Matching, EMNLP: System Demonstrations.\n",
    "```\n",
    "\n",
    "Refer to the `Fig2_EMNLP_inference` notebook where we use these models for inference.\n",
    "\n",
    "---\n",
    "\n",
    "In this notebook:\n",
    "\n",
    "* skyline1: trained on *OCR* dataset\n",
    "* skyline2: trained on *WG:en+OCR* dataset\n",
    "* baseline: trained on *WG:en* dataset\n",
    "\n",
    "---\n",
    "\n",
    "* model A: both embedding and recurrent units are frozen (i.e., their parameters are not updated during fine-tuning).\n",
    "* model B: only the embedding layer is frozen. \n",
    "\n",
    "---\n",
    "\n",
    "To show the impact of fine-tuning and choice of architecture on the model performance, we trained various models starting with the baseline model and included more training instances from the training set of *OCR*.\n",
    "\n",
    "The performance of these models is then assessed on the *OCR* test set. \n",
    "\n",
    "Refer to the paper for more information."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## skyline1"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "from DeezyMatch import train as dm_train"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 09:40:33\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mread input file: ./inputs/input_dfm.yaml\u001b[0m\n",
      "\u001b[92m2020-09-10 09:40:33\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32mpytorch will use: cuda:1\u001b[0m\n",
      "\u001b[92m2020-09-10 09:40:33\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mread CSV file: ./dataset/ocr_trainval.txt\u001b[0m\n",
      "\u001b[92m2020-09-10 09:40:33\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32mnumber of labels, True: 42301 and False: 42302\u001b[0m\n",
      "\u001b[92m2020-09-10 09:40:33\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mSplitting the Dataset\u001b[0m\n",
      "\u001b[92m2020-09-10 09:40:33\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mfinish splitting the Dataset. User time: 0.07219839096069336\u001b[0m\n",
      "\u001b[92m2020-09-10 09:40:33\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msplits are as follow:\n",
      "train    59221\n",
      "val      25380\n",
      "test         2\n",
      "Name: split, dtype: int64\u001b[0m\n",
      "\u001b[92m2020-09-10 09:40:33\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mstart creating a lookup table and convert characters to indices\u001b[0m\n",
      "\u001b[92m2020-09-10 09:40:33\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32m-- create vocabulary\u001b[0m\n",
      "\u001b[92m2020-09-10 09:40:35\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32m-- convert tokens to indices\u001b[0m\n",
      "\u001b[92m2020-09-10 09:40:35\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32m-- create a lookup table for tokens\u001b[0m\n",
      "\u001b[92m2020-09-10 09:40:35\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32m-- read list of characters from ./inputs/characters_v001.vocab\u001b[0m\n",
      "\u001b[92m2020-09-10 09:40:35\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32m-- Length of vocabulary: 7542\u001b[0m\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "                                                                     \r"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "\n",
      "\n",
      "\u001b[92m2020-09-10 09:40:36\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m******************************\u001b[0m\n",
      "\u001b[92m2020-09-10 09:40:36\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m**** (Bi-directional) GRU ****\u001b[0m\n",
      "\u001b[92m2020-09-10 09:40:36\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m******************************\u001b[0m\n",
      "\u001b[92m2020-09-10 09:40:36\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mread inputs\u001b[0m\n",
      "\u001b[92m2020-09-10 09:40:36\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mcreate a two_parallel_rnns model\u001b[0m\n",
      "\u001b[92m2020-09-10 09:40:38\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32mstart fitting parameters\u001b[0m\n",
      "\u001b[92m2020-09-10 09:40:38\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mNumber of batches: 926\u001b[0m\n",
      "\u001b[92m2020-09-10 09:40:38\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mNumber of epochs: 10\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "a92abf6e65d24084afe094f4eca73d04",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=10.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=926.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "\n",
      "\n",
      "====================\n",
      "Total number of params: 684843\n",
      "\n",
      "two_parallel_rnns (\n",
      "  (emb): Embedding(7542, 60), weights=((7542, 60),), parameters=452520\n",
      "  (rnn_1): GRU(60, 60, num_layers=2, dropout=0.01, bidirectional=True), weights=((180, 60), (180, 60), (180,), (180,), (180, 60), (180, 60), (180,), (180,), (180, 120), (180, 60), (180,), (180,), (180, 120), (180, 60), (180,), (180,)), parameters=109440\n",
      "  (attn_step1): Linear(in_features=120, out_features=60, bias=True), weights=((60, 120), (60,)), parameters=7260\n",
      "  (attn_step2): Linear(in_features=60, out_features=1, bias=True), weights=((1, 60), (1,)), parameters=61\n",
      "  (fc1): Linear(in_features=960, out_features=120, bias=True), weights=((120, 960), (120,)), parameters=115320\n",
      "  (fc2): Linear(in_features=120, out_features=2, bias=True), weights=((2, 120), (2,)), parameters=242\n",
      ")\n",
      "====================\n",
      "\n",
      "\n",
      "\u001b[92m2020-09-10 09:41:35\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_09:41:35 -- Epoch: 1/10; Train; loss: 0.293; acc: 0.875; precision: 0.849, recall: 0.912, macrof1: 0.875, weightedf1: 0.875\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 09:41:44\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_09:41:44 -- Epoch: 1/10; Valid; loss: 0.177; acc: 0.935; precision: 0.919, recall: 0.955, macrof1: 0.935, weightedf1: 0.935\u001b[0m\n",
      "\u001b[92m2020-09-10 09:41:44\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=926.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 09:42:40\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_09:42:40 -- Epoch: 2/10; Train; loss: 0.154; acc: 0.946; precision: 0.932, recall: 0.962, macrof1: 0.946, weightedf1: 0.946\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 09:42:49\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_09:42:49 -- Epoch: 2/10; Valid; loss: 0.134; acc: 0.951; precision: 0.931, recall: 0.974, macrof1: 0.951, weightedf1: 0.951\u001b[0m\n",
      "\u001b[92m2020-09-10 09:42:49\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=926.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 09:43:45\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_09:43:45 -- Epoch: 3/10; Train; loss: 0.118; acc: 0.959; precision: 0.949, recall: 0.970, macrof1: 0.959, weightedf1: 0.959\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 09:43:54\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_09:43:54 -- Epoch: 3/10; Valid; loss: 0.130; acc: 0.955; precision: 0.950, recall: 0.962, macrof1: 0.955, weightedf1: 0.955\u001b[0m\n",
      "\u001b[92m2020-09-10 09:43:54\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=926.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 09:44:51\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_09:44:51 -- Epoch: 4/10; Train; loss: 0.095; acc: 0.967; precision: 0.959, recall: 0.976, macrof1: 0.967, weightedf1: 0.967\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 09:45:00\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_09:45:00 -- Epoch: 4/10; Valid; loss: 0.133; acc: 0.956; precision: 0.941, recall: 0.972, macrof1: 0.956, weightedf1: 0.956\u001b[0m\n",
      "\u001b[92m2020-09-10 09:45:00\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model (early stopped) with least valid loss (checkpoint: 3) at ./models/ocr_001/ocr_001.model\u001b[0m\n",
      "\u001b[92m2020-09-10 09:45:00\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n",
      "\u001b[92m2020-09-10 09:45:00\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mEarly stopping at epoch: 4, selected epoch: 3\u001b[0m\n",
      "\n",
      "\n",
      "\n",
      "\n",
      "====================\n",
      "User time: 263.7606\n",
      "====================\n"
     ]
    }
   ],
   "source": [
    "# train a new model\n",
    "dm_train(input_file_path=\"./inputs/input_dfm.yaml\", \n",
    "         dataset_path=\"./dataset/ocr_trainval.txt\", \n",
    "         model_name=\"ocr_001\")\n",
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## skyline1b"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 75,
   "metadata": {
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 22:12:40\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mread input file: ./inputs/input_dfm_b.yaml\u001b[0m\n",
      "\u001b[92m2020-09-10 22:12:40\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32mpytorch will use: cuda:1\u001b[0m\n",
      "\u001b[92m2020-09-10 22:12:40\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mread CSV file: ./dataset/ocr_trainval.txt\u001b[0m\n",
      "\u001b[92m2020-09-10 22:12:40\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32mnumber of labels, True: 42301 and False: 42302\u001b[0m\n",
      "\u001b[92m2020-09-10 22:12:40\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mSplitting the Dataset\u001b[0m\n",
      "\u001b[92m2020-09-10 22:12:40\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mfinish splitting the Dataset. User time: 0.08831453323364258\u001b[0m\n",
      "\u001b[92m2020-09-10 22:12:40\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msplits are as follow:\n",
      "train    83755\n",
      "val        846\n",
      "test         2\n",
      "Name: split, dtype: int64\u001b[0m\n",
      "\u001b[92m2020-09-10 22:12:40\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mstart creating a lookup table and convert characters to indices\u001b[0m\n",
      "\u001b[92m2020-09-10 22:12:40\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32m-- create vocabulary\u001b[0m\n",
      "\u001b[92m2020-09-10 22:12:42\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32m-- convert tokens to indices\u001b[0m\n",
      "\u001b[92m2020-09-10 22:12:42\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32m-- create a lookup table for tokens\u001b[0m\n",
      "\u001b[92m2020-09-10 22:12:42\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32m-- read list of characters from ./inputs/characters_v001.vocab\u001b[0m\n",
      "\u001b[92m2020-09-10 22:12:42\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32m-- Length of vocabulary: 7542\u001b[0m\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "                                                                     "
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "\n",
      "\n",
      "\u001b[92m2020-09-10 22:12:43\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m******************************\u001b[0m\n",
      "\u001b[92m2020-09-10 22:12:43\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m**** (Bi-directional) GRU ****\u001b[0m\n",
      "\u001b[92m2020-09-10 22:12:43\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m******************************\u001b[0m\n",
      "\u001b[92m2020-09-10 22:12:43\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mread inputs\u001b[0m\n",
      "\u001b[92m2020-09-10 22:12:43\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mcreate a two_parallel_rnns model\u001b[0m\n",
      "\u001b[92m2020-09-10 22:12:43\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32mstart fitting parameters\u001b[0m\n",
      "\u001b[92m2020-09-10 22:12:43\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mNumber of batches: 1309\u001b[0m\n",
      "\u001b[92m2020-09-10 22:12:43\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mNumber of epochs: 3\u001b[0m\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "7fee201e3de04a6d9aaf4c74b3d709ff",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=3.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=1309.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "\n",
      "\n",
      "====================\n",
      "Total number of params: 684843\n",
      "\n",
      "two_parallel_rnns (\n",
      "  (emb): Embedding(7542, 60), weights=((7542, 60),), parameters=452520\n",
      "  (rnn_1): GRU(60, 60, num_layers=2, dropout=0.01, bidirectional=True), weights=((180, 60), (180, 60), (180,), (180,), (180, 60), (180, 60), (180,), (180,), (180, 120), (180, 60), (180,), (180,), (180, 120), (180, 60), (180,), (180,)), parameters=109440\n",
      "  (attn_step1): Linear(in_features=120, out_features=60, bias=True), weights=((60, 120), (60,)), parameters=7260\n",
      "  (attn_step2): Linear(in_features=60, out_features=1, bias=True), weights=((1, 60), (1,)), parameters=61\n",
      "  (fc1): Linear(in_features=960, out_features=120, bias=True), weights=((120, 960), (120,)), parameters=115320\n",
      "  (fc2): Linear(in_features=120, out_features=2, bias=True), weights=((2, 120), (2,)), parameters=242\n",
      ")\n",
      "====================\n",
      "\n",
      "\n",
      "\u001b[92m2020-09-10 22:14:04\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_22:14:04 -- Epoch: 1/3; Train; loss: 0.255; acc: 0.895; precision: 0.876, recall: 0.922, macrof1: 0.895, weightedf1: 0.895\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=14.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 22:14:05\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_22:14:05 -- Epoch: 1/3; Valid; loss: 0.135; acc: 0.957; precision: 0.943, recall: 0.974, macrof1: 0.957, weightedf1: 0.957\u001b[0m\n",
      "\u001b[92m2020-09-10 22:14:05\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=1309.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 22:15:25\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_22:15:25 -- Epoch: 2/3; Train; loss: 0.132; acc: 0.955; precision: 0.944, recall: 0.968, macrof1: 0.955, weightedf1: 0.955\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=14.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 22:15:26\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_22:15:26 -- Epoch: 2/3; Valid; loss: 0.106; acc: 0.959; precision: 0.945, recall: 0.974, macrof1: 0.959, weightedf1: 0.959\u001b[0m\n",
      "\u001b[92m2020-09-10 22:15:26\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=1309.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 22:16:46\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_22:16:46 -- Epoch: 3/3; Train; loss: 0.104; acc: 0.965; precision: 0.956, recall: 0.975, macrof1: 0.965, weightedf1: 0.965\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=14.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 22:16:46\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_22:16:46 -- Epoch: 3/3; Valid; loss: 0.093; acc: 0.961; precision: 0.951, recall: 0.972, macrof1: 0.961, weightedf1: 0.961\u001b[0m\n",
      "\u001b[92m2020-09-10 22:16:46\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n",
      "\n",
      "\u001b[92m2020-09-10 22:16:46\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model with least valid loss (checkpoint: 3) at ./models/ocr_001b/ocr_001b.model\u001b[0m\n",
      "\n",
      "\n",
      "\n",
      "====================\n",
      "User time: 243.2404\n",
      "====================\n"
     ]
    }
   ],
   "source": [
    "# train a new model\n",
    "dm_train(input_file_path=\"./inputs/input_dfm_b.yaml\", \n",
    "         dataset_path=\"./dataset/ocr_trainval.txt\", \n",
    "         model_name=\"ocr_001b\")\n",
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## skyline2"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 09:45:00\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mread input file: ./inputs/input_dfm.yaml\u001b[0m\n",
      "\u001b[92m2020-09-10 09:45:00\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32mpytorch will use: cuda:1\u001b[0m\n",
      "\u001b[92m2020-09-10 09:45:00\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mread CSV file: ./dataset/wikigaz_en_ocr_trainval.txt\u001b[0m\n",
      "\u001b[92m2020-09-10 09:45:05\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32mnumber of labels, True: 343520 and False: 343521\u001b[0m\n",
      "\u001b[92m2020-09-10 09:45:05\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mSplitting the Dataset\u001b[0m\n",
      "\u001b[92m2020-09-10 09:45:06\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mfinish splitting the Dataset. User time: 0.5975685119628906\u001b[0m\n",
      "\u001b[92m2020-09-10 09:45:06\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msplits are as follow:\n",
      "train    480927\n",
      "val      206112\n",
      "test          2\n",
      "Name: split, dtype: int64\u001b[0m\n",
      "\u001b[92m2020-09-10 09:45:06\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mstart creating a lookup table and convert characters to indices\u001b[0m\n",
      "\u001b[92m2020-09-10 09:45:07\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32m-- create vocabulary\u001b[0m\n",
      "\u001b[92m2020-09-10 09:45:17\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32m-- convert tokens to indices\u001b[0m\n",
      "\u001b[92m2020-09-10 09:45:17\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32m-- create a lookup table for tokens\u001b[0m\n",
      "\u001b[92m2020-09-10 09:45:17\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32m-- read list of characters from ./inputs/characters_v001.vocab\u001b[0m\n",
      "\u001b[92m2020-09-10 09:45:17\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32m-- Length of vocabulary: 7542\u001b[0m\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "                                                                       \r"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "\n",
      "\n",
      "\u001b[92m2020-09-10 09:45:29\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m******************************\u001b[0m\n",
      "\u001b[92m2020-09-10 09:45:29\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m**** (Bi-directional) GRU ****\u001b[0m\n",
      "\u001b[92m2020-09-10 09:45:29\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m******************************\u001b[0m\n",
      "\u001b[92m2020-09-10 09:45:29\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mread inputs\u001b[0m\n",
      "\u001b[92m2020-09-10 09:45:29\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mcreate a two_parallel_rnns model\u001b[0m\n",
      "\u001b[92m2020-09-10 09:45:29\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32mstart fitting parameters\u001b[0m\n",
      "\u001b[92m2020-09-10 09:45:29\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mNumber of batches: 7515\u001b[0m\n",
      "\u001b[92m2020-09-10 09:45:29\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mNumber of epochs: 10\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "4b6f0115ec2e4f2c849016b34f54117c",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=10.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=7515.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "\n",
      "\n",
      "====================\n",
      "Total number of params: 684843\n",
      "\n",
      "two_parallel_rnns (\n",
      "  (emb): Embedding(7542, 60), weights=((7542, 60),), parameters=452520\n",
      "  (rnn_1): GRU(60, 60, num_layers=2, dropout=0.01, bidirectional=True), weights=((180, 60), (180, 60), (180,), (180,), (180, 60), (180, 60), (180,), (180,), (180, 120), (180, 60), (180,), (180,), (180, 120), (180, 60), (180,), (180,)), parameters=109440\n",
      "  (attn_step1): Linear(in_features=120, out_features=60, bias=True), weights=((60, 120), (60,)), parameters=7260\n",
      "  (attn_step2): Linear(in_features=60, out_features=1, bias=True), weights=((1, 60), (1,)), parameters=61\n",
      "  (fc1): Linear(in_features=960, out_features=120, bias=True), weights=((120, 960), (120,)), parameters=115320\n",
      "  (fc2): Linear(in_features=120, out_features=2, bias=True), weights=((2, 120), (2,)), parameters=242\n",
      ")\n",
      "====================\n",
      "\n",
      "\n",
      "\u001b[92m2020-09-10 10:10:17\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_10:10:17 -- Epoch: 1/10; Train; loss: 0.321; acc: 0.860; precision: 0.849, recall: 0.875, macrof1: 0.860, weightedf1: 0.860\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=3221.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 10:14:06\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_10:14:06 -- Epoch: 1/10; Valid; loss: 0.255; acc: 0.896; precision: 0.909, recall: 0.880, macrof1: 0.896, weightedf1: 0.896\u001b[0m\n",
      "\u001b[92m2020-09-10 10:14:06\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=7515.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 10:32:38\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_10:32:38 -- Epoch: 2/10; Train; loss: 0.219; acc: 0.909; precision: 0.904, recall: 0.914, macrof1: 0.909, weightedf1: 0.909\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=3221.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 10:34:04\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_10:34:04 -- Epoch: 2/10; Valid; loss: 0.212; acc: 0.911; precision: 0.901, recall: 0.924, macrof1: 0.911, weightedf1: 0.911\u001b[0m\n",
      "\u001b[92m2020-09-10 10:34:04\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=7515.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 10:43:38\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_10:43:38 -- Epoch: 3/10; Train; loss: 0.184; acc: 0.924; precision: 0.922, recall: 0.927, macrof1: 0.924, weightedf1: 0.924\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=3221.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 10:45:03\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_10:45:03 -- Epoch: 3/10; Valid; loss: 0.195; acc: 0.919; precision: 0.906, recall: 0.936, macrof1: 0.919, weightedf1: 0.919\u001b[0m\n",
      "\u001b[92m2020-09-10 10:45:03\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=7515.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 10:54:41\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_10:54:41 -- Epoch: 4/10; Train; loss: 0.164; acc: 0.933; precision: 0.931, recall: 0.936, macrof1: 0.933, weightedf1: 0.933\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=3221.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 10:56:05\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_10:56:05 -- Epoch: 4/10; Valid; loss: 0.196; acc: 0.920; precision: 0.903, recall: 0.941, macrof1: 0.920, weightedf1: 0.920\u001b[0m\n",
      "\u001b[92m2020-09-10 10:56:05\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model (early stopped) with least valid loss (checkpoint: 3) at ./models/wikigaz_en_ocr_gru_001/wikigaz_en_ocr_gru_001.model\u001b[0m\n",
      "\u001b[92m2020-09-10 10:56:05\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n",
      "\u001b[92m2020-09-10 10:56:05\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mEarly stopping at epoch: 4, selected epoch: 3\u001b[0m\n",
      "\n",
      "\n",
      "\n",
      "\n",
      "====================\n",
      "User time: 4235.8061\n",
      "====================\n"
     ]
    }
   ],
   "source": [
    "# train a new model\n",
    "dm_train(input_file_path=\"./inputs/input_dfm.yaml\", \n",
    "         dataset_path=\"./dataset/wikigaz_en_ocr_trainval.txt\", \n",
    "         model_name=\"wikigaz_en_ocr_gru_001\")\n",
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## skyline2b"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 76,
   "metadata": {
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 22:16:46\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mread input file: ./inputs/input_dfm_b.yaml\u001b[0m\n",
      "\u001b[92m2020-09-10 22:16:47\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32mpytorch will use: cuda:1\u001b[0m\n",
      "\u001b[92m2020-09-10 22:16:47\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mread CSV file: ./dataset/wikigaz_en_ocr_trainval.txt\u001b[0m\n",
      "\u001b[92m2020-09-10 22:16:52\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32mnumber of labels, True: 343520 and False: 343521\u001b[0m\n",
      "\u001b[92m2020-09-10 22:16:52\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mSplitting the Dataset\u001b[0m\n",
      "\u001b[92m2020-09-10 22:16:53\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mfinish splitting the Dataset. User time: 0.6039412021636963\u001b[0m\n",
      "\u001b[92m2020-09-10 22:16:53\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msplits are as follow:\n",
      "train    680169\n",
      "val        6870\n",
      "test          2\n",
      "Name: split, dtype: int64\u001b[0m\n",
      "\u001b[92m2020-09-10 22:16:53\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mstart creating a lookup table and convert characters to indices\u001b[0m\n",
      "\u001b[92m2020-09-10 22:16:54\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32m-- create vocabulary\u001b[0m\n",
      "\u001b[92m2020-09-10 22:17:04\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32m-- convert tokens to indices\u001b[0m\n",
      "\u001b[92m2020-09-10 22:17:04\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32m-- create a lookup table for tokens\u001b[0m\n",
      "\u001b[92m2020-09-10 22:17:04\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32m-- read list of characters from ./inputs/characters_v001.vocab\u001b[0m\n",
      "\u001b[92m2020-09-10 22:17:04\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32m-- Length of vocabulary: 7542\u001b[0m\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "                                                                       \r"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "\n",
      "\n",
      "\u001b[92m2020-09-10 22:17:16\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m******************************\u001b[0m\n",
      "\u001b[92m2020-09-10 22:17:16\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m**** (Bi-directional) GRU ****\u001b[0m\n",
      "\u001b[92m2020-09-10 22:17:16\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m******************************\u001b[0m\n",
      "\u001b[92m2020-09-10 22:17:16\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mread inputs\u001b[0m\n",
      "\u001b[92m2020-09-10 22:17:16\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mcreate a two_parallel_rnns model\u001b[0m\n",
      "\u001b[92m2020-09-10 22:17:16\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32mstart fitting parameters\u001b[0m\n",
      "\u001b[92m2020-09-10 22:17:16\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mNumber of batches: 10628\u001b[0m\n",
      "\u001b[92m2020-09-10 22:17:16\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mNumber of epochs: 3\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "4f5d305e05f34b548357e8354b885ce4",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=3.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=10628.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "\n",
      "\n",
      "====================\n",
      "Total number of params: 684843\n",
      "\n",
      "two_parallel_rnns (\n",
      "  (emb): Embedding(7542, 60), weights=((7542, 60),), parameters=452520\n",
      "  (rnn_1): GRU(60, 60, num_layers=2, dropout=0.01, bidirectional=True), weights=((180, 60), (180, 60), (180,), (180,), (180, 60), (180, 60), (180,), (180,), (180, 120), (180, 60), (180,), (180,), (180, 120), (180, 60), (180,), (180,)), parameters=109440\n",
      "  (attn_step1): Linear(in_features=120, out_features=60, bias=True), weights=((60, 120), (60,)), parameters=7260\n",
      "  (attn_step2): Linear(in_features=60, out_features=1, bias=True), weights=((1, 60), (1,)), parameters=61\n",
      "  (fc1): Linear(in_features=960, out_features=120, bias=True), weights=((120, 960), (120,)), parameters=115320\n",
      "  (fc2): Linear(in_features=120, out_features=2, bias=True), weights=((2, 120), (2,)), parameters=242\n",
      ")\n",
      "====================\n",
      "\n",
      "\n",
      "\u001b[92m2020-09-10 22:31:09\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_22:31:09 -- Epoch: 1/3; Train; loss: 0.302; acc: 0.869; precision: 0.858, recall: 0.883, macrof1: 0.869, weightedf1: 0.869\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=108.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 22:31:12\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_22:31:12 -- Epoch: 1/3; Valid; loss: 0.226; acc: 0.909; precision: 0.915, recall: 0.902, macrof1: 0.909, weightedf1: 0.909\u001b[0m\n",
      "\u001b[92m2020-09-10 22:31:12\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=10628.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 22:44:40\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_22:44:40 -- Epoch: 2/3; Train; loss: 0.205; acc: 0.914; precision: 0.911, recall: 0.918, macrof1: 0.914, weightedf1: 0.914\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=108.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 22:44:43\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_22:44:43 -- Epoch: 2/3; Valid; loss: 0.188; acc: 0.918; precision: 0.922, recall: 0.914, macrof1: 0.918, weightedf1: 0.918\u001b[0m\n",
      "\u001b[92m2020-09-10 22:44:43\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=10628.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 22:58:12\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_22:58:12 -- Epoch: 3/3; Train; loss: 0.175; acc: 0.928; precision: 0.927, recall: 0.930, macrof1: 0.928, weightedf1: 0.928\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=108.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 22:58:14\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_22:58:14 -- Epoch: 3/3; Valid; loss: 0.189; acc: 0.920; precision: 0.901, recall: 0.944, macrof1: 0.920, weightedf1: 0.920\u001b[0m\n",
      "\u001b[92m2020-09-10 22:58:14\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n",
      "\n",
      "\u001b[92m2020-09-10 22:58:14\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model with least valid loss (checkpoint: 2) at ./models/wikigaz_en_ocr_gru_001b/wikigaz_en_ocr_gru_001b.model\u001b[0m\n",
      "\n",
      "\n",
      "\n",
      "====================\n",
      "User time: 2459.0160\n",
      "====================\n"
     ]
    }
   ],
   "source": [
    "# train a new model\n",
    "dm_train(input_file_path=\"./inputs/input_dfm_b.yaml\", \n",
    "         dataset_path=\"./dataset/wikigaz_en_ocr_trainval.txt\", \n",
    "         model_name=\"wikigaz_en_ocr_gru_001b\")\n",
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## baseline1_gru"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 10:56:05\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mread input file: ./inputs/input_dfm.yaml\u001b[0m\n",
      "\u001b[92m2020-09-10 10:56:05\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32mpytorch will use: cuda:1\u001b[0m\n",
      "\u001b[92m2020-09-10 10:56:05\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mread CSV file: ./dataset/wikigaz_en_trainval.txt\u001b[0m\n",
      "\u001b[92m2020-09-10 10:56:10\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32mnumber of labels, True: 301219 and False: 301219\u001b[0m\n",
      "\u001b[92m2020-09-10 10:56:10\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mSplitting the Dataset\u001b[0m\n",
      "\u001b[92m2020-09-10 10:56:11\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mfinish splitting the Dataset. User time: 0.5371513366699219\u001b[0m\n",
      "\u001b[92m2020-09-10 10:56:11\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msplits are as follow:\n",
      "train    421706\n",
      "val      180730\n",
      "test          2\n",
      "Name: split, dtype: int64\u001b[0m\n",
      "\u001b[92m2020-09-10 10:56:11\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mstart creating a lookup table and convert characters to indices\u001b[0m\n",
      "\u001b[92m2020-09-10 10:56:12\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32m-- create vocabulary\u001b[0m\n",
      "\u001b[92m2020-09-10 10:56:20\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32m-- convert tokens to indices\u001b[0m\n",
      "\u001b[92m2020-09-10 10:56:20\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32m-- create a lookup table for tokens\u001b[0m\n",
      "\u001b[92m2020-09-10 10:56:20\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32m-- read list of characters from ./inputs/characters_v001.vocab\u001b[0m\n",
      "\u001b[92m2020-09-10 10:56:20\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32m-- Length of vocabulary: 7542\u001b[0m\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "                                                                       \r"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "\n",
      "\n",
      "\u001b[92m2020-09-10 10:56:31\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m******************************\u001b[0m\n",
      "\u001b[92m2020-09-10 10:56:31\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m**** (Bi-directional) GRU ****\u001b[0m\n",
      "\u001b[92m2020-09-10 10:56:31\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m******************************\u001b[0m\n",
      "\u001b[92m2020-09-10 10:56:31\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mread inputs\u001b[0m\n",
      "\u001b[92m2020-09-10 10:56:31\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mcreate a two_parallel_rnns model\u001b[0m\n",
      "\u001b[92m2020-09-10 10:56:31\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32mstart fitting parameters\u001b[0m\n",
      "\u001b[92m2020-09-10 10:56:31\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mNumber of batches: 6590\u001b[0m\n",
      "\u001b[92m2020-09-10 10:56:31\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mNumber of epochs: 10\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "0cfc164902224e6faa0d292e61046e43",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=10.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=6590.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "\n",
      "\n",
      "====================\n",
      "Total number of params: 684843\n",
      "\n",
      "two_parallel_rnns (\n",
      "  (emb): Embedding(7542, 60), weights=((7542, 60),), parameters=452520\n",
      "  (rnn_1): GRU(60, 60, num_layers=2, dropout=0.01, bidirectional=True), weights=((180, 60), (180, 60), (180,), (180,), (180, 60), (180, 60), (180,), (180,), (180, 120), (180, 60), (180,), (180,), (180, 120), (180, 60), (180,), (180,)), parameters=109440\n",
      "  (attn_step1): Linear(in_features=120, out_features=60, bias=True), weights=((60, 120), (60,)), parameters=7260\n",
      "  (attn_step2): Linear(in_features=60, out_features=1, bias=True), weights=((1, 60), (1,)), parameters=61\n",
      "  (fc1): Linear(in_features=960, out_features=120, bias=True), weights=((120, 960), (120,)), parameters=115320\n",
      "  (fc2): Linear(in_features=120, out_features=2, bias=True), weights=((2, 120), (2,)), parameters=242\n",
      ")\n",
      "====================\n",
      "\n",
      "\n",
      "\u001b[92m2020-09-10 11:05:20\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_11:05:20 -- Epoch: 1/10; Train; loss: 0.280; acc: 0.883; precision: 0.875, recall: 0.894, macrof1: 0.883, weightedf1: 0.883\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=2824.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 11:06:31\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_11:06:31 -- Epoch: 1/10; Valid; loss: 0.203; acc: 0.918; precision: 0.912, recall: 0.924, macrof1: 0.918, weightedf1: 0.918\u001b[0m\n",
      "\u001b[92m2020-09-10 11:06:31\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=6590.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 11:14:44\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_11:14:44 -- Epoch: 2/10; Train; loss: 0.181; acc: 0.927; precision: 0.926, recall: 0.928, macrof1: 0.927, weightedf1: 0.927\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=2824.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 11:15:55\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_11:15:55 -- Epoch: 2/10; Valid; loss: 0.173; acc: 0.932; precision: 0.931, recall: 0.932, macrof1: 0.932, weightedf1: 0.932\u001b[0m\n",
      "\u001b[92m2020-09-10 11:15:55\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=6590.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 11:24:08\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_11:24:08 -- Epoch: 3/10; Train; loss: 0.153; acc: 0.939; precision: 0.940, recall: 0.937, macrof1: 0.939, weightedf1: 0.939\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=2824.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 11:25:19\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_11:25:19 -- Epoch: 3/10; Valid; loss: 0.168; acc: 0.934; precision: 0.934, recall: 0.933, macrof1: 0.934, weightedf1: 0.934\u001b[0m\n",
      "\u001b[92m2020-09-10 11:25:19\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=6590.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 11:33:33\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_11:33:33 -- Epoch: 4/10; Train; loss: 0.135; acc: 0.945; precision: 0.947, recall: 0.943, macrof1: 0.945, weightedf1: 0.945\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=2824.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 11:34:43\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_11:34:43 -- Epoch: 4/10; Valid; loss: 0.160; acc: 0.937; precision: 0.936, recall: 0.938, macrof1: 0.937, weightedf1: 0.937\u001b[0m\n",
      "\u001b[92m2020-09-10 11:34:43\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=6590.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 11:42:45\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_11:42:45 -- Epoch: 5/10; Train; loss: 0.123; acc: 0.950; precision: 0.953, recall: 0.948, macrof1: 0.950, weightedf1: 0.950\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=2824.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 11:43:52\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_11:43:52 -- Epoch: 5/10; Valid; loss: 0.167; acc: 0.936; precision: 0.941, recall: 0.930, macrof1: 0.936, weightedf1: 0.936\u001b[0m\n",
      "\u001b[92m2020-09-10 11:43:52\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model (early stopped) with least valid loss (checkpoint: 4) at ./models/wikigaz_en_gru_001/wikigaz_en_gru_001.model\u001b[0m\n",
      "\u001b[92m2020-09-10 11:43:52\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n",
      "\u001b[92m2020-09-10 11:43:52\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mEarly stopping at epoch: 5, selected epoch: 4\u001b[0m\n",
      "\n",
      "\n",
      "\n",
      "\n",
      "====================\n",
      "User time: 2841.4835\n",
      "====================\n"
     ]
    }
   ],
   "source": [
    "# train a new model\n",
    "dm_train(input_file_path=\"./inputs/input_dfm.yaml\", \n",
    "         dataset_path=\"./dataset/wikigaz_en_trainval.txt\", \n",
    "         model_name=\"wikigaz_en_gru_001\")\n",
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## baseline1_lstm"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 11:43:52\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mread input file: ./inputs/input_dfm_lstm.yaml\u001b[0m\n",
      "\u001b[92m2020-09-10 11:43:52\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32mpytorch will use: cuda:1\u001b[0m\n",
      "\u001b[92m2020-09-10 11:43:52\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mread CSV file: ./dataset/wikigaz_en_trainval.txt\u001b[0m\n",
      "\u001b[92m2020-09-10 11:43:57\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32mnumber of labels, True: 301219 and False: 301219\u001b[0m\n",
      "\u001b[92m2020-09-10 11:43:57\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mSplitting the Dataset\u001b[0m\n",
      "\u001b[92m2020-09-10 11:43:58\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mfinish splitting the Dataset. User time: 0.5019485950469971\u001b[0m\n",
      "\u001b[92m2020-09-10 11:43:58\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msplits are as follow:\n",
      "train    421706\n",
      "val      180730\n",
      "test          2\n",
      "Name: split, dtype: int64\u001b[0m\n",
      "\u001b[92m2020-09-10 11:43:58\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mstart creating a lookup table and convert characters to indices\u001b[0m\n",
      "\u001b[92m2020-09-10 11:43:59\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32m-- create vocabulary\u001b[0m\n",
      "\u001b[92m2020-09-10 11:44:07\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32m-- convert tokens to indices\u001b[0m\n",
      "\u001b[92m2020-09-10 11:44:07\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32m-- create a lookup table for tokens\u001b[0m\n",
      "\u001b[92m2020-09-10 11:44:07\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32m-- read list of characters from ./inputs/characters_v001.vocab\u001b[0m\n",
      "\u001b[92m2020-09-10 11:44:07\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32m-- Length of vocabulary: 7542\u001b[0m\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "\n",
      "\n",
      "\u001b[92m2020-09-10 11:44:18\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m******************************\u001b[0m\n",
      "\u001b[92m2020-09-10 11:44:18\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m**** (Bi-directional) LSTM ****\u001b[0m\n",
      "\u001b[92m2020-09-10 11:44:18\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m******************************\u001b[0m\n",
      "\u001b[92m2020-09-10 11:44:18\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mread inputs\u001b[0m\n",
      "\u001b[92m2020-09-10 11:44:18\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mcreate a two_parallel_rnns model\u001b[0m\n",
      "\u001b[92m2020-09-10 11:44:18\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32mstart fitting parameters\u001b[0m\n",
      "\u001b[92m2020-09-10 11:44:18\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mNumber of batches: 6590\u001b[0m\n",
      "\u001b[92m2020-09-10 11:44:18\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mNumber of epochs: 10\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "4d048a93850a4541899cecff9a2a69f2",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=10.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=6590.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "\n",
      "\n",
      "====================\n",
      "Total number of params: 721323\n",
      "\n",
      "two_parallel_rnns (\n",
      "  (emb): Embedding(7542, 60), weights=((7542, 60),), parameters=452520\n",
      "  (rnn_1): LSTM(60, 60, num_layers=2, dropout=0.01, bidirectional=True), weights=((240, 60), (240, 60), (240,), (240,), (240, 60), (240, 60), (240,), (240,), (240, 120), (240, 60), (240,), (240,), (240, 120), (240, 60), (240,), (240,)), parameters=145920\n",
      "  (attn_step1): Linear(in_features=120, out_features=60, bias=True), weights=((60, 120), (60,)), parameters=7260\n",
      "  (attn_step2): Linear(in_features=60, out_features=1, bias=True), weights=((1, 60), (1,)), parameters=61\n",
      "  (fc1): Linear(in_features=960, out_features=120, bias=True), weights=((120, 960), (120,)), parameters=115320\n",
      "  (fc2): Linear(in_features=120, out_features=2, bias=True), weights=((2, 120), (2,)), parameters=242\n",
      ")\n",
      "====================\n",
      "\n",
      "\n",
      "\u001b[92m2020-09-10 11:56:42\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_11:56:42 -- Epoch: 1/10; Train; loss: 0.286; acc: 0.880; precision: 0.871, recall: 0.891, macrof1: 0.880, weightedf1: 0.880\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=2824.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 11:59:05\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_11:59:05 -- Epoch: 1/10; Valid; loss: 0.220; acc: 0.911; precision: 0.906, recall: 0.917, macrof1: 0.911, weightedf1: 0.911\u001b[0m\n",
      "\u001b[92m2020-09-10 11:59:05\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=6590.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 12:15:11\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_12:15:11 -- Epoch: 2/10; Train; loss: 0.194; acc: 0.921; precision: 0.919, recall: 0.923, macrof1: 0.921, weightedf1: 0.921\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=2824.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 12:18:24\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_12:18:24 -- Epoch: 2/10; Valid; loss: 0.189; acc: 0.924; precision: 0.932, recall: 0.915, macrof1: 0.924, weightedf1: 0.924\u001b[0m\n",
      "\u001b[92m2020-09-10 12:18:24\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=6590.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 12:26:57\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_12:26:57 -- Epoch: 3/10; Train; loss: 0.162; acc: 0.934; precision: 0.934, recall: 0.934, macrof1: 0.934, weightedf1: 0.934\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=2824.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 12:28:12\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_12:28:12 -- Epoch: 3/10; Valid; loss: 0.171; acc: 0.931; precision: 0.936, recall: 0.926, macrof1: 0.931, weightedf1: 0.931\u001b[0m\n",
      "\u001b[92m2020-09-10 12:28:12\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=6590.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 12:41:48\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_12:41:48 -- Epoch: 4/10; Train; loss: 0.142; acc: 0.943; precision: 0.944, recall: 0.942, macrof1: 0.943, weightedf1: 0.943\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=2824.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 12:43:03\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_12:43:03 -- Epoch: 4/10; Valid; loss: 0.164; acc: 0.935; precision: 0.934, recall: 0.935, macrof1: 0.935, weightedf1: 0.935\u001b[0m\n",
      "\u001b[92m2020-09-10 12:43:03\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=6590.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 12:58:27\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_12:58:27 -- Epoch: 5/10; Train; loss: 0.128; acc: 0.949; precision: 0.949, recall: 0.948, macrof1: 0.949, weightedf1: 0.949\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=2824.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 13:01:29\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_13:01:29 -- Epoch: 5/10; Valid; loss: 0.163; acc: 0.935; precision: 0.932, recall: 0.939, macrof1: 0.935, weightedf1: 0.935\u001b[0m\n",
      "\u001b[92m2020-09-10 13:01:29\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=6590.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 13:15:44\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_13:15:44 -- Epoch: 6/10; Train; loss: 0.115; acc: 0.954; precision: 0.955, recall: 0.953, macrof1: 0.954, weightedf1: 0.954\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=2824.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 13:19:50\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_13:19:50 -- Epoch: 6/10; Valid; loss: 0.166; acc: 0.938; precision: 0.945, recall: 0.932, macrof1: 0.938, weightedf1: 0.938\u001b[0m\n",
      "\u001b[92m2020-09-10 13:19:50\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model (early stopped) with least valid loss (checkpoint: 5) at ./models/wikigaz_en_lstm_001/wikigaz_en_lstm_001.model\u001b[0m\n",
      "\u001b[92m2020-09-10 13:19:50\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n",
      "\u001b[92m2020-09-10 13:19:50\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mEarly stopping at epoch: 6, selected epoch: 5\u001b[0m\n",
      "\n",
      "\n",
      "\n",
      "\n",
      "====================\n",
      "User time: 5732.4270\n",
      "====================\n"
     ]
    }
   ],
   "source": [
    "# Train a new LSTM model on the WG:en train/validation set.\n",
    "# Model settings are read from the YAML input file.\n",
    "dm_train(input_file_path=\"./inputs/input_dfm_lstm.yaml\",\n",
    "         dataset_path=\"./dataset/wikigaz_en_trainval.txt\",\n",
    "         model_name=\"wikigaz_en_lstm_001\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## baseline1_rnn"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 13:19:50\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mread input file: ./inputs/input_dfm_rnn.yaml\u001b[0m\n",
      "\u001b[92m2020-09-10 13:19:50\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32mpytorch will use: cuda:1\u001b[0m\n",
      "\u001b[92m2020-09-10 13:19:50\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mread CSV file: ./dataset/wikigaz_en_trainval.txt\u001b[0m\n",
      "\u001b[92m2020-09-10 13:19:55\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32mnumber of labels, True: 301219 and False: 301219\u001b[0m\n",
      "\u001b[92m2020-09-10 13:19:55\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mSplitting the Dataset\u001b[0m\n",
      "\u001b[92m2020-09-10 13:19:56\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mfinish splitting the Dataset. User time: 0.498227596282959\u001b[0m\n",
      "\u001b[92m2020-09-10 13:19:56\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msplits are as follow:\n",
      "train    421706\n",
      "val      180730\n",
      "test          2\n",
      "Name: split, dtype: int64\u001b[0m\n",
      "\u001b[92m2020-09-10 13:19:56\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mstart creating a lookup table and convert characters to indices\u001b[0m\n",
      "\u001b[92m2020-09-10 13:19:57\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32m-- create vocabulary\u001b[0m\n",
      "\u001b[92m2020-09-10 13:20:05\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32m-- convert tokens to indices\u001b[0m\n",
      "\u001b[92m2020-09-10 13:20:05\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32m-- create a lookup table for tokens\u001b[0m\n",
      "\u001b[92m2020-09-10 13:20:05\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32m-- read list of characters from ./inputs/characters_v001.vocab\u001b[0m\n",
      "\u001b[92m2020-09-10 13:20:05\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32m-- Length of vocabulary: 7542\u001b[0m\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "\n",
      "\n",
      "\u001b[92m2020-09-10 13:20:16\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m******************************\u001b[0m\n",
      "\u001b[92m2020-09-10 13:20:16\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m**** (Bi-directional) RNN ****\u001b[0m\n",
      "\u001b[92m2020-09-10 13:20:16\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m******************************\u001b[0m\n",
      "\u001b[92m2020-09-10 13:20:16\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mread inputs\u001b[0m\n",
      "\u001b[92m2020-09-10 13:20:16\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mcreate a two_parallel_rnns model\u001b[0m\n",
      "\u001b[92m2020-09-10 13:20:16\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32mstart fitting parameters\u001b[0m\n",
      "\u001b[92m2020-09-10 13:20:16\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mNumber of batches: 6590\u001b[0m\n",
      "\u001b[92m2020-09-10 13:20:16\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mNumber of epochs: 10\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "edc55b6c99b149aeb22deb6386f247ec",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=10.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=6590.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "\n",
      "\n",
      "====================\n",
      "Total number of params: 611883\n",
      "\n",
      "two_parallel_rnns (\n",
      "  (emb): Embedding(7542, 60), weights=((7542, 60),), parameters=452520\n",
      "  (rnn_1): RNN(60, 60, num_layers=2, dropout=0.01, bidirectional=True), weights=((60, 60), (60, 60), (60,), (60,), (60, 60), (60, 60), (60,), (60,), (60, 120), (60, 60), (60,), (60,), (60, 120), (60, 60), (60,), (60,)), parameters=36480\n",
      "  (attn_step1): Linear(in_features=120, out_features=60, bias=True), weights=((60, 120), (60,)), parameters=7260\n",
      "  (attn_step2): Linear(in_features=60, out_features=1, bias=True), weights=((1, 60), (1,)), parameters=61\n",
      "  (fc1): Linear(in_features=960, out_features=120, bias=True), weights=((120, 960), (120,)), parameters=115320\n",
      "  (fc2): Linear(in_features=120, out_features=2, bias=True), weights=((2, 120), (2,)), parameters=242\n",
      ")\n",
      "====================\n",
      "\n",
      "\n",
      "\u001b[92m2020-09-10 13:29:41\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_13:29:41 -- Epoch: 1/10; Train; loss: 0.325; acc: 0.861; precision: 0.854, recall: 0.870, macrof1: 0.861, weightedf1: 0.861\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=2824.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 13:30:49\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_13:30:49 -- Epoch: 1/10; Valid; loss: 0.263; acc: 0.891; precision: 0.893, recall: 0.888, macrof1: 0.891, weightedf1: 0.891\u001b[0m\n",
      "\u001b[92m2020-09-10 13:30:49\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=6590.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 13:40:04\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_13:40:04 -- Epoch: 2/10; Train; loss: 0.241; acc: 0.900; precision: 0.905, recall: 0.893, macrof1: 0.900, weightedf1: 0.900\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=2824.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 13:42:13\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_13:42:13 -- Epoch: 2/10; Valid; loss: 0.233; acc: 0.902; precision: 0.902, recall: 0.901, macrof1: 0.902, weightedf1: 0.902\u001b[0m\n",
      "\u001b[92m2020-09-10 13:42:13\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=6590.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 13:54:22\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_13:54:22 -- Epoch: 3/10; Train; loss: 0.218; acc: 0.909; precision: 0.915, recall: 0.901, macrof1: 0.909, weightedf1: 0.909\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=2824.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 13:56:30\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_13:56:30 -- Epoch: 3/10; Valid; loss: 0.223; acc: 0.906; precision: 0.916, recall: 0.896, macrof1: 0.906, weightedf1: 0.906\u001b[0m\n",
      "\u001b[92m2020-09-10 13:56:30\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=6590.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 14:09:19\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_14:09:19 -- Epoch: 4/10; Train; loss: 0.217; acc: 0.910; precision: 0.917, recall: 0.901, macrof1: 0.910, weightedf1: 0.910\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=2824.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 14:10:27\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_14:10:27 -- Epoch: 4/10; Valid; loss: 0.254; acc: 0.894; precision: 0.906, recall: 0.880, macrof1: 0.894, weightedf1: 0.894\u001b[0m\n",
      "\u001b[92m2020-09-10 14:10:27\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model (early stopped) with least valid loss (checkpoint: 3) at ./models/wikigaz_en_rnn_001/wikigaz_en_rnn_001.model\u001b[0m\n",
      "\u001b[92m2020-09-10 14:10:27\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n",
      "\u001b[92m2020-09-10 14:10:27\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mEarly stopping at epoch: 4, selected epoch: 3\u001b[0m\n",
      "\n",
      "\n",
      "\n",
      "\n",
      "====================\n",
      "User time: 3011.0482\n",
      "====================\n"
     ]
    }
   ],
   "source": [
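    "# Train a new (vanilla) RNN model on the WG:en train/validation set.\n",
    "# Model settings are read from the YAML input file.\n",
    "dm_train(input_file_path=\"./inputs/input_dfm_rnn.yaml\",\n",
    "         dataset_path=\"./dataset/wikigaz_en_trainval.txt\",\n",
    "         model_name=\"wikigaz_en_rnn_001\")"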
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
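    "## Fine-Tune, model A, GRU\n",
    "\n",
    "In model A, the embedding and recurrent layers are frozen during fine-tuning on the *OCR* dataset; the parameter list in the output shows which weights remain trainable."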
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 16:12:27\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mread input file: ./inputs/input_dfm_gru_model_A.yaml\u001b[0m\n",
      "\u001b[92m2020-09-10 16:12:27\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32mpytorch will use: cuda:1\u001b[0m\n",
      "\u001b[92m2020-09-10 16:12:27\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mread CSV file: ./dataset/ocr_trainval.txt\u001b[0m\n",
      "\u001b[92m2020-09-10 16:12:27\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32mnumber of labels, True: 42301 and False: 42302\u001b[0m\n",
      "\u001b[92m2020-09-10 16:12:27\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mSplitting the Dataset\u001b[0m\n",
      "\u001b[92m2020-09-10 16:12:27\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mfinish splitting the Dataset. User time: 0.06539654731750488\u001b[0m\n",
      "\u001b[92m2020-09-10 16:12:27\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msplits are as follow:\n",
      "not_assigned    58971\n",
      "val             25380\n",
      "train             250\n",
      "test                2\n",
      "Name: split, dtype: int64\u001b[0m\n",
      "\u001b[92m2020-09-10 16:12:27\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mstart creating a lookup table and convert characters to indices\u001b[0m\n",
      "\u001b[92m2020-09-10 16:12:27\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32m-- create vocabulary\u001b[0m\n",
      "\u001b[92m2020-09-10 16:12:28\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32m-- convert tokens to indices\u001b[0m\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 16:12:29\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mskipping 0 lines\u001b[0m\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "\n",
      "============================================================\n",
      "List all parameters in the model\n",
      "============================================================\n",
      "emb.weight False\n",
      "rnn_1.weight_ih_l0 False\n",
      "rnn_1.weight_hh_l0 False\n",
      "rnn_1.bias_ih_l0 False\n",
      "rnn_1.bias_hh_l0 False\n",
      "rnn_1.weight_ih_l0_reverse False\n",
      "rnn_1.weight_hh_l0_reverse False\n",
      "rnn_1.bias_ih_l0_reverse False\n",
      "rnn_1.bias_hh_l0_reverse False\n",
      "rnn_1.weight_ih_l1 False\n",
      "rnn_1.weight_hh_l1 False\n",
      "rnn_1.bias_ih_l1 False\n",
      "rnn_1.bias_hh_l1 False\n",
      "rnn_1.weight_ih_l1_reverse False\n",
      "rnn_1.weight_hh_l1_reverse False\n",
      "rnn_1.bias_ih_l1_reverse False\n",
      "rnn_1.bias_hh_l1_reverse False\n",
      "attn_step1.weight False\n",
      "attn_step1.bias False\n",
      "attn_step2.weight False\n",
      "attn_step2.bias False\n",
      "fc1.weight True\n",
      "fc1.bias True\n",
      "fc2.weight True\n",
      "fc2.bias True\n",
      "============================================================\n",
      "\n",
      "\n",
      "\n",
      "\u001b[92m2020-09-10 16:12:29\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m******************************\u001b[0m\n",
      "\u001b[92m2020-09-10 16:12:29\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m**** (Bi-directional) GRU ****\u001b[0m\n",
      "\u001b[92m2020-09-10 16:12:29\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m******************************\u001b[0m\n",
      "\u001b[92m2020-09-10 16:12:29\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mNumber of batches: 4\u001b[0m\n",
      "\u001b[92m2020-09-10 16:12:29\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mNumber of epochs: 20\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=20.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=4.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "\n",
      "\n",
      "====================\n",
      "Total number of params: 684843\n",
      "\n",
      "two_parallel_rnns (\n",
      "  (emb): Embedding(7542, 60), weights=((7542, 60),), parameters=452520\n",
      "  (rnn_1): GRU(60, 60, num_layers=2, dropout=0.01, bidirectional=True), weights=((180, 60), (180, 60), (180,), (180,), (180, 60), (180, 60), (180,), (180,), (180, 120), (180, 60), (180,), (180,), (180, 120), (180, 60), (180,), (180,)), parameters=109440\n",
      "  (attn_step1): Linear(in_features=120, out_features=60, bias=True), weights=((60, 120), (60,)), parameters=7260\n",
      "  (attn_step2): Linear(in_features=60, out_features=1, bias=True), weights=((1, 60), (1,)), parameters=61\n",
      "  (fc1): Linear(in_features=960, out_features=120, bias=True), weights=((120, 960), (120,)), parameters=115320\n",
      "  (fc2): Linear(in_features=120, out_features=2, bias=True), weights=((2, 120), (2,)), parameters=242\n",
      ")\n",
      "====================\n",
      "\n",
      "\n",
      "\u001b[92m2020-09-10 16:12:30\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_16:12:30 -- Epoch: 1/20; Train; loss: 1.609; acc: 0.476; precision: 0.474, recall: 0.432, macrof1: 0.475, weightedf1: 0.475\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 16:12:38\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_16:12:38 -- Epoch: 1/20; Valid; loss: 1.509; acc: 0.480; precision: 0.479, recall: 0.467, macrof1: 0.480, weightedf1: 0.480\u001b[0m\n",
      "\u001b[92m2020-09-10 16:12:38\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=4.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 16:12:38\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_16:12:38 -- Epoch: 2/20; Train; loss: 1.279; acc: 0.508; precision: 0.509, recall: 0.456, macrof1: 0.507, weightedf1: 0.507\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 16:12:47\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_16:12:47 -- Epoch: 2/20; Valid; loss: 1.358; acc: 0.495; precision: 0.495, recall: 0.484, macrof1: 0.495, weightedf1: 0.495\u001b[0m\n",
      "\u001b[92m2020-09-10 16:12:47\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=4.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 16:12:47\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_16:12:47 -- Epoch: 3/20; Train; loss: 1.045; acc: 0.556; precision: 0.562, recall: 0.504, macrof1: 0.555, weightedf1: 0.555\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 16:12:56\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_16:12:56 -- Epoch: 3/20; Valid; loss: 1.238; acc: 0.512; precision: 0.512, recall: 0.516, macrof1: 0.512, weightedf1: 0.512\u001b[0m\n",
      "\u001b[92m2020-09-10 16:12:56\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=4.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 16:12:56\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_16:12:56 -- Epoch: 4/20; Train; loss: 0.860; acc: 0.612; precision: 0.625, recall: 0.560, macrof1: 0.611, weightedf1: 0.611\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 16:13:05\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_16:13:05 -- Epoch: 4/20; Valid; loss: 1.147; acc: 0.525; precision: 0.524, recall: 0.537, macrof1: 0.525, weightedf1: 0.525\u001b[0m\n",
      "\u001b[92m2020-09-10 16:13:05\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=4.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 16:13:05\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_16:13:05 -- Epoch: 5/20; Train; loss: 0.741; acc: 0.648; precision: 0.661, recall: 0.608, macrof1: 0.647, weightedf1: 0.647\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 16:13:13\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_16:13:13 -- Epoch: 5/20; Valid; loss: 1.074; acc: 0.538; precision: 0.537, recall: 0.558, macrof1: 0.538, weightedf1: 0.538\u001b[0m\n",
      "\u001b[92m2020-09-10 16:13:13\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=4.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 16:13:14\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_16:13:14 -- Epoch: 6/20; Train; loss: 0.632; acc: 0.692; precision: 0.711, recall: 0.648, macrof1: 0.691, weightedf1: 0.691\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 16:13:22\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_16:13:22 -- Epoch: 6/20; Valid; loss: 1.018; acc: 0.549; precision: 0.546, recall: 0.574, macrof1: 0.548, weightedf1: 0.548\u001b[0m\n",
      "\u001b[92m2020-09-10 16:13:22\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=4.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 16:13:22\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_16:13:22 -- Epoch: 7/20; Train; loss: 0.540; acc: 0.732; precision: 0.750, recall: 0.696, macrof1: 0.732, weightedf1: 0.732\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 16:13:31\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_16:13:31 -- Epoch: 7/20; Valid; loss: 0.976; acc: 0.559; precision: 0.556, recall: 0.588, macrof1: 0.558, weightedf1: 0.558\u001b[0m\n",
      "\u001b[92m2020-09-10 16:13:31\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=4.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 16:13:31\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_16:13:31 -- Epoch: 8/20; Train; loss: 0.475; acc: 0.780; precision: 0.802, recall: 0.744, macrof1: 0.780, weightedf1: 0.780\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 16:13:39\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_16:13:39 -- Epoch: 8/20; Valid; loss: 0.944; acc: 0.566; precision: 0.561, recall: 0.602, macrof1: 0.565, weightedf1: 0.565\u001b[0m\n",
      "\u001b[92m2020-09-10 16:13:39\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=4.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 16:13:40\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_16:13:40 -- Epoch: 9/20; Train; loss: 0.431; acc: 0.800; precision: 0.821, recall: 0.768, macrof1: 0.800, weightedf1: 0.800\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 16:13:48\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_16:13:48 -- Epoch: 9/20; Valid; loss: 0.920; acc: 0.575; precision: 0.568, recall: 0.620, macrof1: 0.574, weightedf1: 0.574\u001b[0m\n",
      "\u001b[92m2020-09-10 16:13:48\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=4.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 16:13:48\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_16:13:48 -- Epoch: 10/20; Train; loss: 0.390; acc: 0.840; precision: 0.840, recall: 0.840, macrof1: 0.840, weightedf1: 0.840\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 16:13:57\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_16:13:57 -- Epoch: 10/20; Valid; loss: 0.903; acc: 0.582; precision: 0.574, recall: 0.631, macrof1: 0.581, weightedf1: 0.581\u001b[0m\n",
      "\u001b[92m2020-09-10 16:13:57\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=4.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 16:13:57\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_16:13:57 -- Epoch: 11/20; Train; loss: 0.352; acc: 0.856; precision: 0.856, recall: 0.856, macrof1: 0.856, weightedf1: 0.856\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 16:14:05\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_16:14:05 -- Epoch: 11/20; Valid; loss: 0.891; acc: 0.588; precision: 0.579, recall: 0.641, macrof1: 0.587, weightedf1: 0.587\u001b[0m\n",
      "\u001b[92m2020-09-10 16:14:05\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=4.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 16:14:06\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_16:14:06 -- Epoch: 12/20; Train; loss: 0.324; acc: 0.868; precision: 0.859, recall: 0.880, macrof1: 0.868, weightedf1: 0.868\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 16:14:14\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_16:14:14 -- Epoch: 12/20; Valid; loss: 0.883; acc: 0.594; precision: 0.584, recall: 0.651, macrof1: 0.592, weightedf1: 0.592\u001b[0m\n",
      "\u001b[92m2020-09-10 16:14:14\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=4.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 16:14:14\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_16:14:14 -- Epoch: 13/20; Train; loss: 0.293; acc: 0.900; precision: 0.891, recall: 0.912, macrof1: 0.900, weightedf1: 0.900\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 16:14:23\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_16:14:23 -- Epoch: 13/20; Valid; loss: 0.878; acc: 0.599; precision: 0.588, recall: 0.662, macrof1: 0.598, weightedf1: 0.598\u001b[0m\n",
      "\u001b[92m2020-09-10 16:14:23\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=4.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 16:14:23\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_16:14:23 -- Epoch: 14/20; Train; loss: 0.266; acc: 0.912; precision: 0.899, recall: 0.928, macrof1: 0.912, weightedf1: 0.912\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 16:14:32\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_16:14:32 -- Epoch: 14/20; Valid; loss: 0.874; acc: 0.604; precision: 0.592, recall: 0.666, macrof1: 0.602, weightedf1: 0.602\u001b[0m\n",
      "\u001b[92m2020-09-10 16:14:32\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=4.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 16:14:32\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_16:14:32 -- Epoch: 15/20; Train; loss: 0.241; acc: 0.928; precision: 0.921, recall: 0.936, macrof1: 0.928, weightedf1: 0.928\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 16:14:40\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_16:14:40 -- Epoch: 15/20; Valid; loss: 0.874; acc: 0.609; precision: 0.596, recall: 0.673, macrof1: 0.607, weightedf1: 0.607\u001b[0m\n",
      "\u001b[92m2020-09-10 16:14:40\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=4.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 16:14:41\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_16:14:41 -- Epoch: 16/20; Train; loss: 0.220; acc: 0.932; precision: 0.909, recall: 0.960, macrof1: 0.932, weightedf1: 0.932\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 16:14:49\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_16:14:49 -- Epoch: 16/20; Valid; loss: 0.875; acc: 0.613; precision: 0.601, recall: 0.675, macrof1: 0.612, weightedf1: 0.612\u001b[0m\n",
      "\u001b[92m2020-09-10 16:14:49\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model (early stopped) with least valid loss (checkpoint: 15) at ./models/wikigaz_en_ft_ocr_gru_v001_n250/wikigaz_en_ft_ocr_gru_v001_n250.model\u001b[0m\n",
      "\u001b[92m2020-09-10 16:14:49\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n",
      "\u001b[92m2020-09-10 16:14:49\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mEarly stopping at epoch: 16, selected epoch: 15\u001b[0m\n",
      "\n",
      "\n",
      "\n",
      "\n",
      "====================\n",
      "User time: 139.8609\n",
      "====================\n"
     ]
    }
   ],
   "source": [
    "from DeezyMatch import finetune as dm_finetune\n",
    "\n",
    "# fine-tune a pretrained model stored at pretrained_model_path and pretrained_vocab_path \n",
    "dm_finetune(input_file_path=\"./inputs/input_dfm_gru_model_A.yaml\", \n",
    "            dataset_path=\"./dataset/ocr_trainval.txt\", \n",
    "            model_name=\"wikigaz_en_ft_ocr_gru_v001_n250\",\n",
    "            pretrained_model_path=\"./models/wikigaz_en_gru_001/wikigaz_en_gru_001.model\", \n",
    "            pretrained_vocab_path=\"./models/wikigaz_en_gru_001/wikigaz_en_gru_001.vocab\",\n",
    "            n_train_examples=250\n",
    "           )"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 16:14:49\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mread input file: ./inputs/input_dfm_gru_model_A.yaml\u001b[0m\n",
      "\u001b[92m2020-09-10 16:14:49\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32mpytorch will use: cuda:1\u001b[0m\n",
      "\u001b[92m2020-09-10 16:14:49\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mread CSV file: ./dataset/ocr_trainval.txt\u001b[0m\n",
      "\u001b[92m2020-09-10 16:14:50\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32mnumber of labels, True: 42301 and False: 42302\u001b[0m\n",
      "\u001b[92m2020-09-10 16:14:50\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mSplitting the Dataset\u001b[0m\n",
      "\u001b[92m2020-09-10 16:14:50\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mfinish splitting the Dataset. User time: 0.06559967994689941\u001b[0m\n",
      "\u001b[92m2020-09-10 16:14:50\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msplits are as follow:\n",
      "not_assigned    58721\n",
      "val             25380\n",
      "train             500\n",
      "test                2\n",
      "Name: split, dtype: int64\u001b[0m\n",
      "\u001b[92m2020-09-10 16:14:50\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mstart creating a lookup table and convert characters to indices\u001b[0m\n",
      "\u001b[92m2020-09-10 16:14:50\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32m-- create vocabulary\u001b[0m\n",
      "\u001b[92m2020-09-10 16:14:51\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32m-- convert tokens to indices\u001b[0m\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "s1 padding:   0%|          | 0/25380 [00:00<?, ?it/s]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 16:14:52\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mskipping 0 lines\u001b[0m\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "                                                                     \r"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "\n",
      "============================================================\n",
      "List all parameters in the model\n",
      "============================================================\n",
      "emb.weight False\n",
      "rnn_1.weight_ih_l0 False\n",
      "rnn_1.weight_hh_l0 False\n",
      "rnn_1.bias_ih_l0 False\n",
      "rnn_1.bias_hh_l0 False\n",
      "rnn_1.weight_ih_l0_reverse False\n",
      "rnn_1.weight_hh_l0_reverse False\n",
      "rnn_1.bias_ih_l0_reverse False\n",
      "rnn_1.bias_hh_l0_reverse False\n",
      "rnn_1.weight_ih_l1 False\n",
      "rnn_1.weight_hh_l1 False\n",
      "rnn_1.bias_ih_l1 False\n",
      "rnn_1.bias_hh_l1 False\n",
      "rnn_1.weight_ih_l1_reverse False\n",
      "rnn_1.weight_hh_l1_reverse False\n",
      "rnn_1.bias_ih_l1_reverse False\n",
      "rnn_1.bias_hh_l1_reverse False\n",
      "attn_step1.weight False\n",
      "attn_step1.bias False\n",
      "attn_step2.weight False\n",
      "attn_step2.bias False\n",
      "fc1.weight True\n",
      "fc1.bias True\n",
      "fc2.weight True\n",
      "fc2.bias True\n",
      "============================================================\n",
      "\n",
      "\n",
      "\n",
      "\u001b[92m2020-09-10 16:14:52\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m******************************\u001b[0m\n",
      "\u001b[92m2020-09-10 16:14:52\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m**** (Bi-directional) GRU ****\u001b[0m\n",
      "\u001b[92m2020-09-10 16:14:52\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m******************************\u001b[0m\n",
      "\u001b[92m2020-09-10 16:14:52\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mNumber of batches: 8\u001b[0m\n",
      "\u001b[92m2020-09-10 16:14:52\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mNumber of epochs: 20\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=20.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=8.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "\n",
      "\n",
      "====================\n",
      "Total number of params: 684843\n",
      "\n",
      "two_parallel_rnns (\n",
      "  (emb): Embedding(7542, 60), weights=((7542, 60),), parameters=452520\n",
      "  (rnn_1): GRU(60, 60, num_layers=2, dropout=0.01, bidirectional=True), weights=((180, 60), (180, 60), (180,), (180,), (180, 60), (180, 60), (180,), (180,), (180, 120), (180, 60), (180,), (180,), (180, 120), (180, 60), (180,), (180,)), parameters=109440\n",
      "  (attn_step1): Linear(in_features=120, out_features=60, bias=True), weights=((60, 120), (60,)), parameters=7260\n",
      "  (attn_step2): Linear(in_features=60, out_features=1, bias=True), weights=((1, 60), (1,)), parameters=61\n",
      "  (fc1): Linear(in_features=960, out_features=120, bias=True), weights=((120, 960), (120,)), parameters=115320\n",
      "  (fc2): Linear(in_features=120, out_features=2, bias=True), weights=((2, 120), (2,)), parameters=242\n",
      ")\n",
      "====================\n",
      "\n",
      "\n",
      "\u001b[92m2020-09-10 16:14:53\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_16:14:53 -- Epoch: 1/20; Train; loss: 1.595; acc: 0.450; precision: 0.444, recall: 0.396, macrof1: 0.448, weightedf1: 0.448\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 16:15:01\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_16:15:01 -- Epoch: 1/20; Valid; loss: 1.339; acc: 0.499; precision: 0.499, recall: 0.494, macrof1: 0.499, weightedf1: 0.499\u001b[0m\n",
      "\u001b[92m2020-09-10 16:15:01\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=8.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 16:15:01\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_16:15:01 -- Epoch: 2/20; Train; loss: 1.143; acc: 0.530; precision: 0.531, recall: 0.520, macrof1: 0.530, weightedf1: 0.530\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 16:15:10\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_16:15:10 -- Epoch: 2/20; Valid; loss: 1.110; acc: 0.532; precision: 0.529, recall: 0.582, macrof1: 0.531, weightedf1: 0.531\u001b[0m\n",
      "\u001b[92m2020-09-10 16:15:10\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=8.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 16:15:10\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_16:15:10 -- Epoch: 3/20; Train; loss: 0.893; acc: 0.572; precision: 0.571, recall: 0.576, macrof1: 0.572, weightedf1: 0.572\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 16:15:19\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_16:15:19 -- Epoch: 3/20; Valid; loss: 0.961; acc: 0.556; precision: 0.551, recall: 0.603, macrof1: 0.555, weightedf1: 0.555\u001b[0m\n",
      "\u001b[92m2020-09-10 16:15:19\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=8.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 16:15:19\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_16:15:19 -- Epoch: 4/20; Train; loss: 0.727; acc: 0.624; precision: 0.624, recall: 0.624, macrof1: 0.624, weightedf1: 0.624\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 16:15:28\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_16:15:28 -- Epoch: 4/20; Valid; loss: 0.870; acc: 0.579; precision: 0.572, recall: 0.628, macrof1: 0.578, weightedf1: 0.578\u001b[0m\n",
      "\u001b[92m2020-09-10 16:15:28\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=8.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 16:15:28\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_16:15:28 -- Epoch: 5/20; Train; loss: 0.627; acc: 0.684; precision: 0.687, recall: 0.676, macrof1: 0.684, weightedf1: 0.684\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 16:15:37\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_16:15:37 -- Epoch: 5/20; Valid; loss: 0.806; acc: 0.597; precision: 0.590, recall: 0.633, macrof1: 0.596, weightedf1: 0.596\u001b[0m\n",
      "\u001b[92m2020-09-10 16:15:37\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=8.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 16:15:37\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_16:15:37 -- Epoch: 6/20; Train; loss: 0.556; acc: 0.712; precision: 0.719, recall: 0.696, macrof1: 0.712, weightedf1: 0.712\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 16:15:46\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_16:15:46 -- Epoch: 6/20; Valid; loss: 0.768; acc: 0.613; precision: 0.604, recall: 0.657, macrof1: 0.612, weightedf1: 0.612\u001b[0m\n",
      "\u001b[92m2020-09-10 16:15:46\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=8.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 16:15:46\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_16:15:46 -- Epoch: 7/20; Train; loss: 0.492; acc: 0.752; precision: 0.765, recall: 0.728, macrof1: 0.752, weightedf1: 0.752\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 16:15:55\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_16:15:55 -- Epoch: 7/20; Valid; loss: 0.743; acc: 0.627; precision: 0.616, recall: 0.673, macrof1: 0.626, weightedf1: 0.626\u001b[0m\n",
      "\u001b[92m2020-09-10 16:15:55\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=8.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 16:15:55\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_16:15:55 -- Epoch: 8/20; Train; loss: 0.448; acc: 0.804; precision: 0.817, recall: 0.784, macrof1: 0.804, weightedf1: 0.804\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 16:16:03\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_16:16:03 -- Epoch: 8/20; Valid; loss: 0.726; acc: 0.642; precision: 0.627, recall: 0.699, macrof1: 0.641, weightedf1: 0.641\u001b[0m\n",
      "\u001b[92m2020-09-10 16:16:03\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=8.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 16:16:04\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_16:16:04 -- Epoch: 9/20; Train; loss: 0.401; acc: 0.826; precision: 0.830, recall: 0.820, macrof1: 0.826, weightedf1: 0.826\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 16:16:12\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_16:16:12 -- Epoch: 9/20; Valid; loss: 0.713; acc: 0.652; precision: 0.636, recall: 0.713, macrof1: 0.651, weightedf1: 0.651\u001b[0m\n",
      "\u001b[92m2020-09-10 16:16:12\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=8.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 16:16:12\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_16:16:12 -- Epoch: 10/20; Train; loss: 0.364; acc: 0.844; precision: 0.841, recall: 0.848, macrof1: 0.844, weightedf1: 0.844\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 16:16:21\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_16:16:21 -- Epoch: 10/20; Valid; loss: 0.701; acc: 0.664; precision: 0.650, recall: 0.712, macrof1: 0.663, weightedf1: 0.663\u001b[0m\n",
      "\u001b[92m2020-09-10 16:16:21\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=8.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 16:16:21\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_16:16:21 -- Epoch: 11/20; Train; loss: 0.334; acc: 0.864; precision: 0.876, recall: 0.848, macrof1: 0.864, weightedf1: 0.864\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 16:16:30\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_16:16:30 -- Epoch: 11/20; Valid; loss: 0.694; acc: 0.674; precision: 0.665, recall: 0.703, macrof1: 0.674, weightedf1: 0.674\u001b[0m\n",
      "\u001b[92m2020-09-10 16:16:30\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=8.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 16:16:30\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_16:16:30 -- Epoch: 12/20; Train; loss: 0.301; acc: 0.894; precision: 0.896, recall: 0.892, macrof1: 0.894, weightedf1: 0.894\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 16:16:38\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_16:16:38 -- Epoch: 12/20; Valid; loss: 0.692; acc: 0.684; precision: 0.668, recall: 0.730, macrof1: 0.683, weightedf1: 0.683\u001b[0m\n",
      "\u001b[92m2020-09-10 16:16:38\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=8.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 16:16:39\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_16:16:39 -- Epoch: 13/20; Train; loss: 0.268; acc: 0.916; precision: 0.913, recall: 0.920, macrof1: 0.916, weightedf1: 0.916\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 16:16:47\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_16:16:47 -- Epoch: 13/20; Valid; loss: 0.692; acc: 0.690; precision: 0.674, recall: 0.737, macrof1: 0.689, weightedf1: 0.689\u001b[0m\n",
      "\u001b[92m2020-09-10 16:16:47\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model (early stopped) with least valid loss (checkpoint: 12) at ./models/wikigaz_en_ft_ocr_gru_v001_n500/wikigaz_en_ft_ocr_gru_v001_n500.model\u001b[0m\n",
      "\u001b[92m2020-09-10 16:16:47\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n",
      "\u001b[92m2020-09-10 16:16:47\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mEarly stopping at epoch: 13, selected epoch: 12\u001b[0m\n",
      "\n",
      "\n",
      "\n",
      "\n",
      "====================\n",
      "User time: 115.1197\n",
      "====================\n"
     ]
    }
   ],
   "source": [
    "from DeezyMatch import finetune as dm_finetune\n",
    "\n",
    "# number of OCR training pairs used for fine-tuning\n",
    "n_ft_examples = 500\n",
    "\n",
    "# Fine-tune the pretrained WG:en GRU model (weights and vocabulary given by\n",
    "# pretrained_model_path and pretrained_vocab_path) on the OCR dataset,\n",
    "# using the model A configuration (embedding and recurrent layers frozen).\n",
    "dm_finetune(input_file_path=\"./inputs/input_dfm_gru_model_A.yaml\",\n",
    "            dataset_path=\"./dataset/ocr_trainval.txt\",\n",
    "            model_name=f\"wikigaz_en_ft_ocr_gru_v001_n{n_ft_examples}\",\n",
    "            pretrained_model_path=\"./models/wikigaz_en_gru_001/wikigaz_en_gru_001.model\",\n",
    "            pretrained_vocab_path=\"./models/wikigaz_en_gru_001/wikigaz_en_gru_001.vocab\",\n",
    "            n_train_examples=n_ft_examples\n",
    "           )"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 16:16:47\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mread input file: ./inputs/input_dfm_gru_model_A.yaml\u001b[0m\n",
      "\u001b[92m2020-09-10 16:16:47\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32mpytorch will use: cuda:1\u001b[0m\n",
      "\u001b[92m2020-09-10 16:16:47\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mread CSV file: ./dataset/ocr_trainval.txt\u001b[0m\n",
      "\u001b[92m2020-09-10 16:16:48\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32mnumber of labels, True: 42301 and False: 42302\u001b[0m\n",
      "\u001b[92m2020-09-10 16:16:48\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mSplitting the Dataset\u001b[0m\n",
      "\u001b[92m2020-09-10 16:16:48\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mfinish splitting the Dataset. User time: 0.0675957202911377\u001b[0m\n",
      "\u001b[92m2020-09-10 16:16:48\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msplits are as follow:\n",
      "not_assigned    58221\n",
      "val             25380\n",
      "train            1000\n",
      "test                2\n",
      "Name: split, dtype: int64\u001b[0m\n",
      "\u001b[92m2020-09-10 16:16:48\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mstart creating a lookup table and convert characters to indices\u001b[0m\n",
      "\u001b[92m2020-09-10 16:16:48\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32m-- create vocabulary\u001b[0m\n",
      "\u001b[92m2020-09-10 16:16:49\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32m-- convert tokens to indices\u001b[0m\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "s1 padding:   0%|          | 0/25380 [00:00<?, ?it/s]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 16:16:50\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mskipping 0 lines\u001b[0m\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "                                                                     \r"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "\n",
      "============================================================\n",
      "List all parameters in the model\n",
      "============================================================\n",
      "emb.weight False\n",
      "rnn_1.weight_ih_l0 False\n",
      "rnn_1.weight_hh_l0 False\n",
      "rnn_1.bias_ih_l0 False\n",
      "rnn_1.bias_hh_l0 False\n",
      "rnn_1.weight_ih_l0_reverse False\n",
      "rnn_1.weight_hh_l0_reverse False\n",
      "rnn_1.bias_ih_l0_reverse False\n",
      "rnn_1.bias_hh_l0_reverse False\n",
      "rnn_1.weight_ih_l1 False\n",
      "rnn_1.weight_hh_l1 False\n",
      "rnn_1.bias_ih_l1 False\n",
      "rnn_1.bias_hh_l1 False\n",
      "rnn_1.weight_ih_l1_reverse False\n",
      "rnn_1.weight_hh_l1_reverse False\n",
      "rnn_1.bias_ih_l1_reverse False\n",
      "rnn_1.bias_hh_l1_reverse False\n",
      "attn_step1.weight False\n",
      "attn_step1.bias False\n",
      "attn_step2.weight False\n",
      "attn_step2.bias False\n",
      "fc1.weight True\n",
      "fc1.bias True\n",
      "fc2.weight True\n",
      "fc2.bias True\n",
      "============================================================\n",
      "\n",
      "\n",
      "\n",
      "\u001b[92m2020-09-10 16:16:50\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m******************************\u001b[0m\n",
      "\u001b[92m2020-09-10 16:16:50\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m**** (Bi-directional) GRU ****\u001b[0m\n",
      "\u001b[92m2020-09-10 16:16:50\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m******************************\u001b[0m\n",
      "\u001b[92m2020-09-10 16:16:50\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mNumber of batches: 16\u001b[0m\n",
      "\u001b[92m2020-09-10 16:16:50\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mNumber of epochs: 20\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=20.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=16.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "\n",
      "\n",
      "====================\n",
      "Total number of params: 684843\n",
      "\n",
      "two_parallel_rnns (\n",
      "  (emb): Embedding(7542, 60), weights=((7542, 60),), parameters=452520\n",
      "  (rnn_1): GRU(60, 60, num_layers=2, dropout=0.01, bidirectional=True), weights=((180, 60), (180, 60), (180,), (180,), (180, 60), (180, 60), (180,), (180,), (180, 120), (180, 60), (180,), (180,), (180, 120), (180, 60), (180,), (180,)), parameters=109440\n",
      "  (attn_step1): Linear(in_features=120, out_features=60, bias=True), weights=((60, 120), (60,)), parameters=7260\n",
      "  (attn_step2): Linear(in_features=60, out_features=1, bias=True), weights=((1, 60), (1,)), parameters=61\n",
      "  (fc1): Linear(in_features=960, out_features=120, bias=True), weights=((120, 960), (120,)), parameters=115320\n",
      "  (fc2): Linear(in_features=120, out_features=2, bias=True), weights=((2, 120), (2,)), parameters=242\n",
      ")\n",
      "====================\n",
      "\n",
      "\n",
      "\u001b[92m2020-09-10 16:16:51\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_16:16:51 -- Epoch: 1/20; Train; loss: 1.366; acc: 0.484; precision: 0.484, recall: 0.490, macrof1: 0.484, weightedf1: 0.484\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 16:16:59\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_16:16:59 -- Epoch: 1/20; Valid; loss: 1.083; acc: 0.522; precision: 0.522, recall: 0.515, macrof1: 0.522, weightedf1: 0.522\u001b[0m\n",
      "\u001b[92m2020-09-10 16:16:59\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=16.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 16:17:00\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_16:17:00 -- Epoch: 2/20; Train; loss: 0.862; acc: 0.568; precision: 0.571, recall: 0.548, macrof1: 0.568, weightedf1: 0.568\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 16:17:09\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_16:17:09 -- Epoch: 2/20; Valid; loss: 0.832; acc: 0.572; precision: 0.573, recall: 0.563, macrof1: 0.572, weightedf1: 0.572\u001b[0m\n",
      "\u001b[92m2020-09-10 16:17:09\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=16.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 16:17:09\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_16:17:09 -- Epoch: 3/20; Train; loss: 0.658; acc: 0.646; precision: 0.643, recall: 0.658, macrof1: 0.646, weightedf1: 0.646\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 16:17:18\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_16:17:18 -- Epoch: 3/20; Valid; loss: 0.733; acc: 0.614; precision: 0.605, recall: 0.655, macrof1: 0.613, weightedf1: 0.613\u001b[0m\n",
      "\u001b[92m2020-09-10 16:17:18\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=16.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 16:17:18\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_16:17:18 -- Epoch: 4/20; Train; loss: 0.566; acc: 0.701; precision: 0.695, recall: 0.716, macrof1: 0.701, weightedf1: 0.701\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 16:17:27\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_16:17:27 -- Epoch: 4/20; Valid; loss: 0.680; acc: 0.646; precision: 0.639, recall: 0.668, macrof1: 0.645, weightedf1: 0.645\u001b[0m\n",
      "\u001b[92m2020-09-10 16:17:27\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=16.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 16:17:27\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_16:17:27 -- Epoch: 5/20; Train; loss: 0.500; acc: 0.753; precision: 0.760, recall: 0.740, macrof1: 0.753, weightedf1: 0.753\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 16:17:36\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_16:17:36 -- Epoch: 5/20; Valid; loss: 0.643; acc: 0.673; precision: 0.667, recall: 0.693, macrof1: 0.673, weightedf1: 0.673\u001b[0m\n",
      "\u001b[92m2020-09-10 16:17:36\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=16.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 16:17:36\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_16:17:36 -- Epoch: 6/20; Train; loss: 0.445; acc: 0.800; precision: 0.795, recall: 0.808, macrof1: 0.800, weightedf1: 0.800\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 16:17:45\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_16:17:45 -- Epoch: 6/20; Valid; loss: 0.619; acc: 0.694; precision: 0.688, recall: 0.710, macrof1: 0.694, weightedf1: 0.694\u001b[0m\n",
      "\u001b[92m2020-09-10 16:17:45\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=16.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 16:17:46\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_16:17:46 -- Epoch: 7/20; Train; loss: 0.400; acc: 0.835; precision: 0.825, recall: 0.850, macrof1: 0.835, weightedf1: 0.835\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 16:17:54\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_16:17:54 -- Epoch: 7/20; Valid; loss: 0.603; acc: 0.712; precision: 0.709, recall: 0.720, macrof1: 0.712, weightedf1: 0.712\u001b[0m\n",
      "\u001b[92m2020-09-10 16:17:54\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=16.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 16:17:55\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_16:17:55 -- Epoch: 8/20; Train; loss: 0.351; acc: 0.865; precision: 0.860, recall: 0.872, macrof1: 0.865, weightedf1: 0.865\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 16:18:03\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_16:18:03 -- Epoch: 8/20; Valid; loss: 0.589; acc: 0.724; precision: 0.725, recall: 0.722, macrof1: 0.724, weightedf1: 0.724\u001b[0m\n",
      "\u001b[92m2020-09-10 16:18:03\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=16.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 16:18:04\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_16:18:04 -- Epoch: 9/20; Train; loss: 0.308; acc: 0.894; precision: 0.876, recall: 0.918, macrof1: 0.894, weightedf1: 0.894\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 16:18:12\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_16:18:12 -- Epoch: 9/20; Valid; loss: 0.581; acc: 0.732; precision: 0.730, recall: 0.736, macrof1: 0.732, weightedf1: 0.732\u001b[0m\n",
      "\u001b[92m2020-09-10 16:18:12\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=16.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 16:18:13\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_16:18:13 -- Epoch: 10/20; Train; loss: 0.268; acc: 0.912; precision: 0.901, recall: 0.926, macrof1: 0.912, weightedf1: 0.912\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 16:18:22\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_16:18:22 -- Epoch: 10/20; Valid; loss: 0.573; acc: 0.743; precision: 0.742, recall: 0.744, macrof1: 0.743, weightedf1: 0.743\u001b[0m\n",
      "\u001b[92m2020-09-10 16:18:22\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=16.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 16:18:22\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_16:18:22 -- Epoch: 11/20; Train; loss: 0.231; acc: 0.928; precision: 0.908, recall: 0.952, macrof1: 0.928, weightedf1: 0.928\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 16:18:31\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_16:18:31 -- Epoch: 11/20; Valid; loss: 0.572; acc: 0.747; precision: 0.754, recall: 0.733, macrof1: 0.747, weightedf1: 0.747\u001b[0m\n",
      "\u001b[92m2020-09-10 16:18:31\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=16.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 16:18:31\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_16:18:31 -- Epoch: 12/20; Train; loss: 0.201; acc: 0.949; precision: 0.939, recall: 0.960, macrof1: 0.949, weightedf1: 0.949\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 16:18:39\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_16:18:39 -- Epoch: 12/20; Valid; loss: 0.572; acc: 0.753; precision: 0.762, recall: 0.737, macrof1: 0.753, weightedf1: 0.753\u001b[0m\n",
      "\u001b[92m2020-09-10 16:18:39\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model (early stopped) with least valid loss (checkpoint: 11) at ./models/wikigaz_en_ft_ocr_gru_v001_n1000/wikigaz_en_ft_ocr_gru_v001_n1000.model\u001b[0m\n",
      "\u001b[92m2020-09-10 16:18:40\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n",
      "\u001b[92m2020-09-10 16:18:40\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mEarly stopping at epoch: 12, selected epoch: 11\u001b[0m\n",
      "\n",
      "\n",
      "\n",
      "\n",
      "====================\n",
      "User time: 109.2621\n",
      "====================\n"
     ]
    }
   ],
   "source": [
    "from DeezyMatch import finetune as dm_finetune\n",
    "\n",
    "n_ft_examples = 1000\n",
    "\n",
    "# fine-tune a pretrained model stored at pretrained_model_path and pretrained_vocab_path \n",
    "dm_finetune(input_file_path=\"./inputs/input_dfm_gru_model_A.yaml\", \n",
    "            dataset_path=\"./dataset/ocr_trainval.txt\", \n",
    "            model_name=f\"wikigaz_en_ft_ocr_gru_v001_n{n_ft_examples}\",\n",
    "            pretrained_model_path=\"./models/wikigaz_en_gru_001/wikigaz_en_gru_001.model\", \n",
    "            pretrained_vocab_path=\"./models/wikigaz_en_gru_001/wikigaz_en_gru_001.vocab\",\n",
    "            n_train_examples=n_ft_examples\n",
    "           )"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 16:18:40\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mread input file: ./inputs/input_dfm_gru_model_A.yaml\u001b[0m\n",
      "\u001b[92m2020-09-10 16:18:40\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32mpytorch will use: cuda:1\u001b[0m\n",
      "\u001b[92m2020-09-10 16:18:40\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mread CSV file: ./dataset/ocr_trainval.txt\u001b[0m\n",
      "\u001b[92m2020-09-10 16:18:40\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32mnumber of labels, True: 42301 and False: 42302\u001b[0m\n",
      "\u001b[92m2020-09-10 16:18:40\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mSplitting the Dataset\u001b[0m\n",
      "\u001b[92m2020-09-10 16:18:40\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mfinish splitting the Dataset. User time: 0.060266733169555664\u001b[0m\n",
      "\u001b[92m2020-09-10 16:18:40\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msplits are as follow:\n",
      "not_assigned    57221\n",
      "val             25380\n",
      "train            2000\n",
      "test                2\n",
      "Name: split, dtype: int64\u001b[0m\n",
      "\u001b[92m2020-09-10 16:18:40\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mstart creating a lookup table and convert characters to indices\u001b[0m\n",
      "\u001b[92m2020-09-10 16:18:40\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32m-- create vocabulary\u001b[0m\n",
      "\u001b[92m2020-09-10 16:18:41\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32m-- convert tokens to indices\u001b[0m\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 16:18:42\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mskipping 0 lines\u001b[0m\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "\n",
      "============================================================\n",
      "List all parameters in the model\n",
      "============================================================\n",
      "emb.weight False\n",
      "rnn_1.weight_ih_l0 False\n",
      "rnn_1.weight_hh_l0 False\n",
      "rnn_1.bias_ih_l0 False\n",
      "rnn_1.bias_hh_l0 False\n",
      "rnn_1.weight_ih_l0_reverse False\n",
      "rnn_1.weight_hh_l0_reverse False\n",
      "rnn_1.bias_ih_l0_reverse False\n",
      "rnn_1.bias_hh_l0_reverse False\n",
      "rnn_1.weight_ih_l1 False\n",
      "rnn_1.weight_hh_l1 False\n",
      "rnn_1.bias_ih_l1 False\n",
      "rnn_1.bias_hh_l1 False\n",
      "rnn_1.weight_ih_l1_reverse False\n",
      "rnn_1.weight_hh_l1_reverse False\n",
      "rnn_1.bias_ih_l1_reverse False\n",
      "rnn_1.bias_hh_l1_reverse False\n",
      "attn_step1.weight False\n",
      "attn_step1.bias False\n",
      "attn_step2.weight False\n",
      "attn_step2.bias False\n",
      "fc1.weight True\n",
      "fc1.bias True\n",
      "fc2.weight True\n",
      "fc2.bias True\n",
      "============================================================\n",
      "\n",
      "\n",
      "\n",
      "\u001b[92m2020-09-10 16:18:42\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m******************************\u001b[0m\n",
      "\u001b[92m2020-09-10 16:18:42\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m**** (Bi-directional) GRU ****\u001b[0m\n",
      "\u001b[92m2020-09-10 16:18:42\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m******************************\u001b[0m\n",
      "\u001b[92m2020-09-10 16:18:42\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mNumber of batches: 32\u001b[0m\n",
      "\u001b[92m2020-09-10 16:18:42\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mNumber of epochs: 20\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=20.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=32.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "\n",
      "\n",
      "====================\n",
      "Total number of params: 684843\n",
      "\n",
      "two_parallel_rnns (\n",
      "  (emb): Embedding(7542, 60), weights=((7542, 60),), parameters=452520\n",
      "  (rnn_1): GRU(60, 60, num_layers=2, dropout=0.01, bidirectional=True), weights=((180, 60), (180, 60), (180,), (180,), (180, 60), (180, 60), (180,), (180,), (180, 120), (180, 60), (180,), (180,), (180, 120), (180, 60), (180,), (180,)), parameters=109440\n",
      "  (attn_step1): Linear(in_features=120, out_features=60, bias=True), weights=((60, 120), (60,)), parameters=7260\n",
      "  (attn_step2): Linear(in_features=60, out_features=1, bias=True), weights=((1, 60), (1,)), parameters=61\n",
      "  (fc1): Linear(in_features=960, out_features=120, bias=True), weights=((120, 960), (120,)), parameters=115320\n",
      "  (fc2): Linear(in_features=120, out_features=2, bias=True), weights=((2, 120), (2,)), parameters=242\n",
      ")\n",
      "====================\n",
      "\n",
      "\n",
      "\u001b[92m2020-09-10 16:18:43\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_16:18:43 -- Epoch: 1/20; Train; loss: 1.167; acc: 0.515; precision: 0.515, recall: 0.521, macrof1: 0.515, weightedf1: 0.515\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 16:18:52\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_16:18:52 -- Epoch: 1/20; Valid; loss: 0.843; acc: 0.569; precision: 0.569, recall: 0.563, macrof1: 0.569, weightedf1: 0.569\u001b[0m\n",
      "\u001b[92m2020-09-10 16:18:52\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=32.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 16:18:53\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_16:18:53 -- Epoch: 2/20; Train; loss: 0.687; acc: 0.634; precision: 0.625, recall: 0.669, macrof1: 0.634, weightedf1: 0.634\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 16:19:01\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_16:19:01 -- Epoch: 2/20; Valid; loss: 0.670; acc: 0.644; precision: 0.633, recall: 0.687, macrof1: 0.644, weightedf1: 0.644\u001b[0m\n",
      "\u001b[92m2020-09-10 16:19:01\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=32.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 16:19:02\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_16:19:02 -- Epoch: 3/20; Train; loss: 0.560; acc: 0.712; precision: 0.710, recall: 0.716, macrof1: 0.712, weightedf1: 0.712\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 16:19:10\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_16:19:10 -- Epoch: 3/20; Valid; loss: 0.606; acc: 0.691; precision: 0.665, recall: 0.770, macrof1: 0.689, weightedf1: 0.689\u001b[0m\n",
      "\u001b[92m2020-09-10 16:19:10\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=32.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 16:19:11\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_16:19:11 -- Epoch: 4/20; Train; loss: 0.492; acc: 0.763; precision: 0.740, recall: 0.812, macrof1: 0.763, weightedf1: 0.763\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 16:19:20\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_16:19:20 -- Epoch: 4/20; Valid; loss: 0.565; acc: 0.727; precision: 0.733, recall: 0.715, macrof1: 0.727, weightedf1: 0.727\u001b[0m\n",
      "\u001b[92m2020-09-10 16:19:20\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=32.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 16:19:21\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_16:19:21 -- Epoch: 5/20; Train; loss: 0.423; acc: 0.814; precision: 0.801, recall: 0.834, macrof1: 0.813, weightedf1: 0.813\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 16:19:29\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_16:19:29 -- Epoch: 5/20; Valid; loss: 0.535; acc: 0.749; precision: 0.736, recall: 0.777, macrof1: 0.749, weightedf1: 0.749\u001b[0m\n",
      "\u001b[92m2020-09-10 16:19:29\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=32.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 16:19:30\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_16:19:30 -- Epoch: 6/20; Train; loss: 0.365; acc: 0.852; precision: 0.833, recall: 0.881, macrof1: 0.852, weightedf1: 0.852\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 16:19:39\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_16:19:39 -- Epoch: 6/20; Valid; loss: 0.515; acc: 0.763; precision: 0.741, recall: 0.809, macrof1: 0.763, weightedf1: 0.763\u001b[0m\n",
      "\u001b[92m2020-09-10 16:19:39\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=32.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 16:19:40\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_16:19:40 -- Epoch: 7/20; Train; loss: 0.307; acc: 0.881; precision: 0.861, recall: 0.908, macrof1: 0.881, weightedf1: 0.881\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 16:19:48\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_16:19:48 -- Epoch: 7/20; Valid; loss: 0.498; acc: 0.779; precision: 0.779, recall: 0.781, macrof1: 0.779, weightedf1: 0.779\u001b[0m\n",
      "\u001b[92m2020-09-10 16:19:48\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=32.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 16:19:49\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_16:19:49 -- Epoch: 8/20; Train; loss: 0.266; acc: 0.908; precision: 0.896, recall: 0.924, macrof1: 0.908, weightedf1: 0.908\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 16:19:57\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_16:19:57 -- Epoch: 8/20; Valid; loss: 0.494; acc: 0.784; precision: 0.776, recall: 0.799, macrof1: 0.784, weightedf1: 0.784\u001b[0m\n",
      "\u001b[92m2020-09-10 16:19:57\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=32.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 16:19:58\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_16:19:58 -- Epoch: 9/20; Train; loss: 0.226; acc: 0.937; precision: 0.918, recall: 0.960, macrof1: 0.937, weightedf1: 0.937\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 16:20:05\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_16:20:05 -- Epoch: 9/20; Valid; loss: 0.491; acc: 0.790; precision: 0.781, recall: 0.805, macrof1: 0.790, weightedf1: 0.790\u001b[0m\n",
      "\u001b[92m2020-09-10 16:20:05\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=32.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 16:20:06\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_16:20:06 -- Epoch: 10/20; Train; loss: 0.186; acc: 0.950; precision: 0.932, recall: 0.970, macrof1: 0.949, weightedf1: 0.949\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 16:20:14\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_16:20:14 -- Epoch: 10/20; Valid; loss: 0.492; acc: 0.793; precision: 0.779, recall: 0.820, macrof1: 0.793, weightedf1: 0.793\u001b[0m\n",
      "\u001b[92m2020-09-10 16:20:14\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model (early stopped) with least valid loss (checkpoint: 9) at ./models/wikigaz_en_ft_ocr_gru_v001_n2000/wikigaz_en_ft_ocr_gru_v001_n2000.model\u001b[0m\n",
      "\u001b[92m2020-09-10 16:20:14\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n",
      "\u001b[92m2020-09-10 16:20:14\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mEarly stopping at epoch: 10, selected epoch: 9\u001b[0m\n",
      "\n",
      "\n",
      "\n",
      "\n",
      "====================\n",
      "User time: 92.1158\n",
      "====================\n"
     ]
    }
   ],
   "source": [
    "from DeezyMatch import finetune as dm_finetune\n",
    "\n",
    "n_ft_examples = 2000\n",
    "\n",
    "# Fine-tune the pretrained model (weights and vocabulary given by pretrained_model_path\n",
    "# and pretrained_vocab_path) on n_ft_examples training instances from the OCR dataset.\n",
    "dm_finetune(input_file_path=\"./inputs/input_dfm_gru_model_A.yaml\",\n",
    "            dataset_path=\"./dataset/ocr_trainval.txt\",\n",
    "            model_name=f\"wikigaz_en_ft_ocr_gru_v001_n{n_ft_examples}\",\n",
    "            pretrained_model_path=\"./models/wikigaz_en_gru_001/wikigaz_en_gru_001.model\",\n",
    "            pretrained_vocab_path=\"./models/wikigaz_en_gru_001/wikigaz_en_gru_001.vocab\",\n",
    "            n_train_examples=n_ft_examples\n",
    "           )"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 16:20:14\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mread input file: ./inputs/input_dfm_gru_model_A.yaml\u001b[0m\n",
      "\u001b[92m2020-09-10 16:20:14\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32mpytorch will use: cuda:1\u001b[0m\n",
      "\u001b[92m2020-09-10 16:20:14\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mread CSV file: ./dataset/ocr_trainval.txt\u001b[0m\n",
      "\u001b[92m2020-09-10 16:20:15\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32mnumber of labels, True: 42301 and False: 42302\u001b[0m\n",
      "\u001b[92m2020-09-10 16:20:15\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mSplitting the Dataset\u001b[0m\n",
      "\u001b[92m2020-09-10 16:20:15\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mfinish splitting the Dataset. User time: 0.051637887954711914\u001b[0m\n",
      "\u001b[92m2020-09-10 16:20:15\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msplits are as follow:\n",
      "not_assigned    55221\n",
      "val             25380\n",
      "train            4000\n",
      "test                2\n",
      "Name: split, dtype: int64\u001b[0m\n",
      "\u001b[92m2020-09-10 16:20:15\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mstart creating a lookup table and convert characters to indices\u001b[0m\n",
      "\u001b[92m2020-09-10 16:20:15\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32m-- create vocabulary\u001b[0m\n",
      "\u001b[92m2020-09-10 16:20:16\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32m-- convert tokens to indices\u001b[0m\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "length s2:   0%|          | 0/25380 [00:00<?, ?it/s]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 16:20:17\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mskipping 0 lines\u001b[0m\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "                                                                     \r"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "\n",
      "============================================================\n",
      "List all parameters in the model\n",
      "============================================================\n",
      "emb.weight False\n",
      "rnn_1.weight_ih_l0 False\n",
      "rnn_1.weight_hh_l0 False\n",
      "rnn_1.bias_ih_l0 False\n",
      "rnn_1.bias_hh_l0 False\n",
      "rnn_1.weight_ih_l0_reverse False\n",
      "rnn_1.weight_hh_l0_reverse False\n",
      "rnn_1.bias_ih_l0_reverse False\n",
      "rnn_1.bias_hh_l0_reverse False\n",
      "rnn_1.weight_ih_l1 False\n",
      "rnn_1.weight_hh_l1 False\n",
      "rnn_1.bias_ih_l1 False\n",
      "rnn_1.bias_hh_l1 False\n",
      "rnn_1.weight_ih_l1_reverse False\n",
      "rnn_1.weight_hh_l1_reverse False\n",
      "rnn_1.bias_ih_l1_reverse False\n",
      "rnn_1.bias_hh_l1_reverse False\n",
      "attn_step1.weight False\n",
      "attn_step1.bias False\n",
      "attn_step2.weight False\n",
      "attn_step2.bias False\n",
      "fc1.weight True\n",
      "fc1.bias True\n",
      "fc2.weight True\n",
      "fc2.bias True\n",
      "============================================================\n",
      "\n",
      "\n",
      "\n",
      "\u001b[92m2020-09-10 16:20:17\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m******************************\u001b[0m\n",
      "\u001b[92m2020-09-10 16:20:17\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m**** (Bi-directional) GRU ****\u001b[0m\n",
      "\u001b[92m2020-09-10 16:20:17\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m******************************\u001b[0m\n",
      "\u001b[92m2020-09-10 16:20:17\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mNumber of batches: 63\u001b[0m\n",
      "\u001b[92m2020-09-10 16:20:17\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mNumber of epochs: 20\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=20.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=63.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "\n",
      "\n",
      "====================\n",
      "Total number of params: 684843\n",
      "\n",
      "two_parallel_rnns (\n",
      "  (emb): Embedding(7542, 60), weights=((7542, 60),), parameters=452520\n",
      "  (rnn_1): GRU(60, 60, num_layers=2, dropout=0.01, bidirectional=True), weights=((180, 60), (180, 60), (180,), (180,), (180, 60), (180, 60), (180,), (180,), (180, 120), (180, 60), (180,), (180,), (180, 120), (180, 60), (180,), (180,)), parameters=109440\n",
      "  (attn_step1): Linear(in_features=120, out_features=60, bias=True), weights=((60, 120), (60,)), parameters=7260\n",
      "  (attn_step2): Linear(in_features=60, out_features=1, bias=True), weights=((1, 60), (1,)), parameters=61\n",
      "  (fc1): Linear(in_features=960, out_features=120, bias=True), weights=((120, 960), (120,)), parameters=115320\n",
      "  (fc2): Linear(in_features=120, out_features=2, bias=True), weights=((2, 120), (2,)), parameters=242\n",
      ")\n",
      "====================\n",
      "\n",
      "\n",
      "\u001b[92m2020-09-10 16:20:19\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_16:20:19 -- Epoch: 1/20; Train; loss: 0.938; acc: 0.582; precision: 0.581, recall: 0.584, macrof1: 0.581, weightedf1: 0.581\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 16:20:27\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_16:20:27 -- Epoch: 1/20; Valid; loss: 0.660; acc: 0.652; precision: 0.644, recall: 0.679, macrof1: 0.652, weightedf1: 0.652\u001b[0m\n",
      "\u001b[92m2020-09-10 16:20:27\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=63.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 16:20:29\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_16:20:29 -- Epoch: 2/20; Train; loss: 0.549; acc: 0.727; precision: 0.724, recall: 0.734, macrof1: 0.727, weightedf1: 0.727\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 16:20:37\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_16:20:37 -- Epoch: 2/20; Valid; loss: 0.537; acc: 0.745; precision: 0.752, recall: 0.730, macrof1: 0.745, weightedf1: 0.745\u001b[0m\n",
      "\u001b[92m2020-09-10 16:20:37\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=63.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 16:20:39\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_16:20:39 -- Epoch: 3/20; Train; loss: 0.447; acc: 0.795; precision: 0.799, recall: 0.789, macrof1: 0.795, weightedf1: 0.795\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 16:20:46\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_16:20:46 -- Epoch: 3/20; Valid; loss: 0.482; acc: 0.781; precision: 0.786, recall: 0.773, macrof1: 0.781, weightedf1: 0.781\u001b[0m\n",
      "\u001b[92m2020-09-10 16:20:46\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=63.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 16:20:48\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_16:20:48 -- Epoch: 4/20; Train; loss: 0.376; acc: 0.836; precision: 0.832, recall: 0.843, macrof1: 0.836, weightedf1: 0.836\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 16:20:56\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_16:20:56 -- Epoch: 4/20; Valid; loss: 0.447; acc: 0.802; precision: 0.805, recall: 0.798, macrof1: 0.802, weightedf1: 0.802\u001b[0m\n",
      "\u001b[92m2020-09-10 16:20:56\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=63.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 16:20:58\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_16:20:58 -- Epoch: 5/20; Train; loss: 0.324; acc: 0.860; precision: 0.850, recall: 0.873, macrof1: 0.860, weightedf1: 0.860\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 16:21:06\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_16:21:06 -- Epoch: 5/20; Valid; loss: 0.430; acc: 0.814; precision: 0.805, recall: 0.829, macrof1: 0.814, weightedf1: 0.814\u001b[0m\n",
      "\u001b[92m2020-09-10 16:21:06\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=63.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 16:21:08\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_16:21:08 -- Epoch: 6/20; Train; loss: 0.275; acc: 0.895; precision: 0.887, recall: 0.906, macrof1: 0.895, weightedf1: 0.895\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 16:21:16\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_16:21:16 -- Epoch: 6/20; Valid; loss: 0.424; acc: 0.819; precision: 0.801, recall: 0.849, macrof1: 0.819, weightedf1: 0.819\u001b[0m\n",
      "\u001b[92m2020-09-10 16:21:16\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=63.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 16:21:17\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_16:21:17 -- Epoch: 7/20; Train; loss: 0.228; acc: 0.919; precision: 0.906, recall: 0.934, macrof1: 0.919, weightedf1: 0.919\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 16:21:25\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_16:21:25 -- Epoch: 7/20; Valid; loss: 0.418; acc: 0.826; precision: 0.822, recall: 0.832, macrof1: 0.826, weightedf1: 0.826\u001b[0m\n",
      "\u001b[92m2020-09-10 16:21:25\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=63.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 16:21:27\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_16:21:27 -- Epoch: 8/20; Train; loss: 0.187; acc: 0.941; precision: 0.929, recall: 0.954, macrof1: 0.940, weightedf1: 0.940\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 16:21:35\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_16:21:35 -- Epoch: 8/20; Valid; loss: 0.418; acc: 0.830; precision: 0.831, recall: 0.829, macrof1: 0.830, weightedf1: 0.830\u001b[0m\n",
      "\u001b[92m2020-09-10 16:21:35\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model (early stopped) with least valid loss (checkpoint: 7) at ./models/wikigaz_en_ft_ocr_gru_v001_n4000/wikigaz_en_ft_ocr_gru_v001_n4000.model\u001b[0m\n",
      "\u001b[92m2020-09-10 16:21:35\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n",
      "\u001b[92m2020-09-10 16:21:35\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mEarly stopping at epoch: 8, selected epoch: 7\u001b[0m\n",
      "\n",
      "\n",
      "\n",
      "\n",
      "====================\n",
      "User time: 77.7573\n",
      "====================\n"
     ]
    }
   ],
   "source": [
    "from DeezyMatch import finetune as dm_finetune\n",
    "\n",
    "n_ft_examples = 4000\n",
    "\n",
    "# Fine-tune the pretrained model (weights and vocabulary given by pretrained_model_path\n",
    "# and pretrained_vocab_path) on n_ft_examples training instances from the OCR dataset.\n",
    "dm_finetune(input_file_path=\"./inputs/input_dfm_gru_model_A.yaml\",\n",
    "            dataset_path=\"./dataset/ocr_trainval.txt\",\n",
    "            model_name=f\"wikigaz_en_ft_ocr_gru_v001_n{n_ft_examples}\",\n",
    "            pretrained_model_path=\"./models/wikigaz_en_gru_001/wikigaz_en_gru_001.model\",\n",
    "            pretrained_vocab_path=\"./models/wikigaz_en_gru_001/wikigaz_en_gru_001.vocab\",\n",
    "            n_train_examples=n_ft_examples\n",
    "           )"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 16:21:35\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mread input file: ./inputs/input_dfm_gru_model_A.yaml\u001b[0m\n",
      "\u001b[92m2020-09-10 16:21:35\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32mpytorch will use: cuda:1\u001b[0m\n",
      "\u001b[92m2020-09-10 16:21:35\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mread CSV file: ./dataset/ocr_trainval.txt\u001b[0m\n",
      "\u001b[92m2020-09-10 16:21:36\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32mnumber of labels, True: 42301 and False: 42302\u001b[0m\n",
      "\u001b[92m2020-09-10 16:21:36\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mSplitting the Dataset\u001b[0m\n",
      "\u001b[92m2020-09-10 16:21:36\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mfinish splitting the Dataset. User time: 0.0495457649230957\u001b[0m\n",
      "\u001b[92m2020-09-10 16:21:36\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msplits are as follow:\n",
      "not_assigned    51221\n",
      "val             25380\n",
      "train            8000\n",
      "test                2\n",
      "Name: split, dtype: int64\u001b[0m\n",
      "\u001b[92m2020-09-10 16:21:36\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mstart creating a lookup table and convert characters to indices\u001b[0m\n",
      "\u001b[92m2020-09-10 16:21:36\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32m-- create vocabulary\u001b[0m\n",
      "\u001b[92m2020-09-10 16:21:37\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32m-- convert tokens to indices\u001b[0m\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 16:21:37\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mskipping 0 lines\u001b[0m\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "\n",
      "============================================================\n",
      "List all parameters in the model\n",
      "============================================================\n",
      "emb.weight False\n",
      "rnn_1.weight_ih_l0 False\n",
      "rnn_1.weight_hh_l0 False\n",
      "rnn_1.bias_ih_l0 False\n",
      "rnn_1.bias_hh_l0 False\n",
      "rnn_1.weight_ih_l0_reverse False\n",
      "rnn_1.weight_hh_l0_reverse False\n",
      "rnn_1.bias_ih_l0_reverse False\n",
      "rnn_1.bias_hh_l0_reverse False\n",
      "rnn_1.weight_ih_l1 False\n",
      "rnn_1.weight_hh_l1 False\n",
      "rnn_1.bias_ih_l1 False\n",
      "rnn_1.bias_hh_l1 False\n",
      "rnn_1.weight_ih_l1_reverse False\n",
      "rnn_1.weight_hh_l1_reverse False\n",
      "rnn_1.bias_ih_l1_reverse False\n",
      "rnn_1.bias_hh_l1_reverse False\n",
      "attn_step1.weight False\n",
      "attn_step1.bias False\n",
      "attn_step2.weight False\n",
      "attn_step2.bias False\n",
      "fc1.weight True\n",
      "fc1.bias True\n",
      "fc2.weight True\n",
      "fc2.bias True\n",
      "============================================================\n",
      "\n",
      "\n",
      "\n",
      "\u001b[92m2020-09-10 16:21:38\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m******************************\u001b[0m\n",
      "\u001b[92m2020-09-10 16:21:38\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m**** (Bi-directional) GRU ****\u001b[0m\n",
      "\u001b[92m2020-09-10 16:21:38\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m******************************\u001b[0m\n",
      "\u001b[92m2020-09-10 16:21:38\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mNumber of batches: 125\u001b[0m\n",
      "\u001b[92m2020-09-10 16:21:38\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mNumber of epochs: 20\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=20.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=125.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "\n",
      "\n",
      "====================\n",
      "Total number of params: 684843\n",
      "\n",
      "two_parallel_rnns (\n",
      "  (emb): Embedding(7542, 60), weights=((7542, 60),), parameters=452520\n",
      "  (rnn_1): GRU(60, 60, num_layers=2, dropout=0.01, bidirectional=True), weights=((180, 60), (180, 60), (180,), (180,), (180, 60), (180, 60), (180,), (180,), (180, 120), (180, 60), (180,), (180,), (180, 120), (180, 60), (180,), (180,)), parameters=109440\n",
      "  (attn_step1): Linear(in_features=120, out_features=60, bias=True), weights=((60, 120), (60,)), parameters=7260\n",
      "  (attn_step2): Linear(in_features=60, out_features=1, bias=True), weights=((1, 60), (1,)), parameters=61\n",
      "  (fc1): Linear(in_features=960, out_features=120, bias=True), weights=((120, 960), (120,)), parameters=115320\n",
      "  (fc2): Linear(in_features=120, out_features=2, bias=True), weights=((2, 120), (2,)), parameters=242\n",
      ")\n",
      "====================\n",
      "\n",
      "\n",
      "\u001b[92m2020-09-10 16:21:41\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_16:21:41 -- Epoch: 1/20; Train; loss: 0.786; acc: 0.627; precision: 0.622, recall: 0.648, macrof1: 0.627, weightedf1: 0.627\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 16:21:49\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_16:21:49 -- Epoch: 1/20; Valid; loss: 0.540; acc: 0.738; precision: 0.732, recall: 0.751, macrof1: 0.738, weightedf1: 0.738\u001b[0m\n",
      "\u001b[92m2020-09-10 16:21:49\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=125.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 16:21:53\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_16:21:53 -- Epoch: 2/20; Train; loss: 0.461; acc: 0.788; precision: 0.773, recall: 0.815, macrof1: 0.787, weightedf1: 0.787\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 16:22:01\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_16:22:01 -- Epoch: 2/20; Valid; loss: 0.437; acc: 0.804; precision: 0.785, recall: 0.838, macrof1: 0.804, weightedf1: 0.804\u001b[0m\n",
      "\u001b[92m2020-09-10 16:22:01\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=125.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 16:22:04\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_16:22:04 -- Epoch: 3/20; Train; loss: 0.370; acc: 0.841; precision: 0.827, recall: 0.862, macrof1: 0.841, weightedf1: 0.841\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 16:22:12\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_16:22:12 -- Epoch: 3/20; Valid; loss: 0.396; acc: 0.829; precision: 0.825, recall: 0.836, macrof1: 0.829, weightedf1: 0.829\u001b[0m\n",
      "\u001b[92m2020-09-10 16:22:12\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=125.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 16:22:16\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_16:22:16 -- Epoch: 4/20; Train; loss: 0.309; acc: 0.875; precision: 0.860, recall: 0.895, macrof1: 0.875, weightedf1: 0.875\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 16:22:24\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_16:22:24 -- Epoch: 4/20; Valid; loss: 0.379; acc: 0.837; precision: 0.810, recall: 0.881, macrof1: 0.837, weightedf1: 0.837\u001b[0m\n",
      "\u001b[92m2020-09-10 16:22:24\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=125.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 16:22:27\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_16:22:27 -- Epoch: 5/20; Train; loss: 0.258; acc: 0.898; precision: 0.883, recall: 0.917, macrof1: 0.898, weightedf1: 0.898\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 16:22:35\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_16:22:35 -- Epoch: 5/20; Valid; loss: 0.363; acc: 0.849; precision: 0.838, recall: 0.867, macrof1: 0.849, weightedf1: 0.849\u001b[0m\n",
      "\u001b[92m2020-09-10 16:22:35\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=125.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 16:22:39\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_16:22:39 -- Epoch: 6/20; Train; loss: 0.210; acc: 0.926; precision: 0.915, recall: 0.940, macrof1: 0.926, weightedf1: 0.926\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 16:22:46\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_16:22:46 -- Epoch: 6/20; Valid; loss: 0.362; acc: 0.854; precision: 0.847, recall: 0.863, macrof1: 0.854, weightedf1: 0.854\u001b[0m\n",
      "\u001b[92m2020-09-10 16:22:46\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=125.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 16:22:50\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_16:22:50 -- Epoch: 7/20; Train; loss: 0.172; acc: 0.945; precision: 0.933, recall: 0.958, macrof1: 0.944, weightedf1: 0.944\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 16:22:58\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_16:22:58 -- Epoch: 7/20; Valid; loss: 0.373; acc: 0.854; precision: 0.850, recall: 0.861, macrof1: 0.854, weightedf1: 0.854\u001b[0m\n",
      "\u001b[92m2020-09-10 16:22:58\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model (early stopped) with least valid loss (checkpoint: 6) at ./models/wikigaz_en_ft_ocr_gru_v001_n8000/wikigaz_en_ft_ocr_gru_v001_n8000.model\u001b[0m\n",
      "\u001b[92m2020-09-10 16:22:58\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n",
      "\u001b[92m2020-09-10 16:22:58\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mEarly stopping at epoch: 7, selected epoch: 6\u001b[0m\n",
      "\n",
      "\n",
      "\n",
      "\n",
      "====================\n",
      "User time: 80.4513\n",
      "====================\n"
     ]
    }
   ],
   "source": [
    "from DeezyMatch import finetune as dm_finetune\n",
    "\n",
    "n_ft_examples = 8000\n",
    "\n",
    "# Fine-tune the pretrained model (weights at pretrained_model_path, vocabulary at pretrained_vocab_path)\n",
    "# on n_ft_examples training instances from the OCR dataset, using the model A configuration.\n",
    "dm_finetune(input_file_path=\"./inputs/input_dfm_gru_model_A.yaml\",\n",
    "            dataset_path=\"./dataset/ocr_trainval.txt\",\n",
    "            model_name=f\"wikigaz_en_ft_ocr_gru_v001_n{n_ft_examples}\",\n",
    "            pretrained_model_path=\"./models/wikigaz_en_gru_001/wikigaz_en_gru_001.model\",\n",
    "            pretrained_vocab_path=\"./models/wikigaz_en_gru_001/wikigaz_en_gru_001.vocab\",\n",
    "            n_train_examples=n_ft_examples\n",
    "           )"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 16:22:58\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mread input file: ./inputs/input_dfm_gru_model_A.yaml\u001b[0m\n",
      "\u001b[92m2020-09-10 16:22:58\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32mpytorch will use: cuda:1\u001b[0m\n",
      "\u001b[92m2020-09-10 16:22:58\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mread CSV file: ./dataset/ocr_trainval.txt\u001b[0m\n",
      "\u001b[92m2020-09-10 16:22:59\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32mnumber of labels, True: 42301 and False: 42302\u001b[0m\n",
      "\u001b[92m2020-09-10 16:22:59\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mSplitting the Dataset\u001b[0m\n",
      "\u001b[92m2020-09-10 16:22:59\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mfinish splitting the Dataset. User time: 0.047933101654052734\u001b[0m\n",
      "\u001b[92m2020-09-10 16:22:59\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msplits are as follow:\n",
      "not_assigned    43221\n",
      "val             25380\n",
      "train           16000\n",
      "test                2\n",
      "Name: split, dtype: int64\u001b[0m\n",
      "\u001b[92m2020-09-10 16:22:59\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mstart creating a lookup table and convert characters to indices\u001b[0m\n",
      "\u001b[92m2020-09-10 16:22:59\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32m-- create vocabulary\u001b[0m\n",
      "\u001b[92m2020-09-10 16:23:00\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32m-- convert tokens to indices\u001b[0m\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 16:23:01\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mskipping 0 lines\u001b[0m\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "\n",
      "============================================================\n",
      "List all parameters in the model\n",
      "============================================================\n",
      "emb.weight False\n",
      "rnn_1.weight_ih_l0 False\n",
      "rnn_1.weight_hh_l0 False\n",
      "rnn_1.bias_ih_l0 False\n",
      "rnn_1.bias_hh_l0 False\n",
      "rnn_1.weight_ih_l0_reverse False\n",
      "rnn_1.weight_hh_l0_reverse False\n",
      "rnn_1.bias_ih_l0_reverse False\n",
      "rnn_1.bias_hh_l0_reverse False\n",
      "rnn_1.weight_ih_l1 False\n",
      "rnn_1.weight_hh_l1 False\n",
      "rnn_1.bias_ih_l1 False\n",
      "rnn_1.bias_hh_l1 False\n",
      "rnn_1.weight_ih_l1_reverse False\n",
      "rnn_1.weight_hh_l1_reverse False\n",
      "rnn_1.bias_ih_l1_reverse False\n",
      "rnn_1.bias_hh_l1_reverse False\n",
      "attn_step1.weight False\n",
      "attn_step1.bias False\n",
      "attn_step2.weight False\n",
      "attn_step2.bias False\n",
      "fc1.weight True\n",
      "fc1.bias True\n",
      "fc2.weight True\n",
      "fc2.bias True\n",
      "============================================================\n",
      "\n",
      "\n",
      "\n",
      "\u001b[92m2020-09-10 16:23:01\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m******************************\u001b[0m\n",
      "\u001b[92m2020-09-10 16:23:01\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m**** (Bi-directional) GRU ****\u001b[0m\n",
      "\u001b[92m2020-09-10 16:23:01\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m******************************\u001b[0m\n",
      "\u001b[92m2020-09-10 16:23:01\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mNumber of batches: 250\u001b[0m\n",
      "\u001b[92m2020-09-10 16:23:01\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mNumber of epochs: 20\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=20.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=250.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "\n",
      "\n",
      "====================\n",
      "Total number of params: 684843\n",
      "\n",
      "two_parallel_rnns (\n",
      "  (emb): Embedding(7542, 60), weights=((7542, 60),), parameters=452520\n",
      "  (rnn_1): GRU(60, 60, num_layers=2, dropout=0.01, bidirectional=True), weights=((180, 60), (180, 60), (180,), (180,), (180, 60), (180, 60), (180,), (180,), (180, 120), (180, 60), (180,), (180,), (180, 120), (180, 60), (180,), (180,)), parameters=109440\n",
      "  (attn_step1): Linear(in_features=120, out_features=60, bias=True), weights=((60, 120), (60,)), parameters=7260\n",
      "  (attn_step2): Linear(in_features=60, out_features=1, bias=True), weights=((1, 60), (1,)), parameters=61\n",
      "  (fc1): Linear(in_features=960, out_features=120, bias=True), weights=((120, 960), (120,)), parameters=115320\n",
      "  (fc2): Linear(in_features=120, out_features=2, bias=True), weights=((2, 120), (2,)), parameters=242\n",
      ")\n",
      "====================\n",
      "\n",
      "\n",
      "\u001b[92m2020-09-10 16:23:08\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_16:23:08 -- Epoch: 1/20; Train; loss: 0.624; acc: 0.705; precision: 0.700, recall: 0.715, macrof1: 0.705, weightedf1: 0.705\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 16:23:16\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_16:23:16 -- Epoch: 1/20; Valid; loss: 0.434; acc: 0.807; precision: 0.837, recall: 0.761, macrof1: 0.806, weightedf1: 0.806\u001b[0m\n",
      "\u001b[92m2020-09-10 16:23:16\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=250.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 16:23:23\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_16:23:23 -- Epoch: 2/20; Train; loss: 0.367; acc: 0.843; precision: 0.834, recall: 0.856, macrof1: 0.843, weightedf1: 0.843\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 16:23:31\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_16:23:31 -- Epoch: 2/20; Valid; loss: 0.356; acc: 0.847; precision: 0.833, recall: 0.869, macrof1: 0.847, weightedf1: 0.847\u001b[0m\n",
      "\u001b[92m2020-09-10 16:23:31\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=250.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 16:23:38\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_16:23:38 -- Epoch: 3/20; Train; loss: 0.295; acc: 0.881; precision: 0.868, recall: 0.898, macrof1: 0.881, weightedf1: 0.881\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 16:23:46\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_16:23:46 -- Epoch: 3/20; Valid; loss: 0.322; acc: 0.865; precision: 0.849, recall: 0.888, macrof1: 0.865, weightedf1: 0.865\u001b[0m\n",
      "\u001b[92m2020-09-10 16:23:46\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=250.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 16:23:53\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_16:23:53 -- Epoch: 4/20; Train; loss: 0.243; acc: 0.906; precision: 0.893, recall: 0.922, macrof1: 0.906, weightedf1: 0.906\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 16:24:01\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_16:24:01 -- Epoch: 4/20; Valid; loss: 0.308; acc: 0.873; precision: 0.855, recall: 0.897, macrof1: 0.873, weightedf1: 0.873\u001b[0m\n",
      "\u001b[92m2020-09-10 16:24:01\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=250.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 16:24:08\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_16:24:08 -- Epoch: 5/20; Train; loss: 0.197; acc: 0.927; precision: 0.918, recall: 0.938, macrof1: 0.927, weightedf1: 0.927\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 16:24:16\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_16:24:16 -- Epoch: 5/20; Valid; loss: 0.307; acc: 0.877; precision: 0.859, recall: 0.903, macrof1: 0.877, weightedf1: 0.877\u001b[0m\n",
      "\u001b[92m2020-09-10 16:24:16\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=250.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 16:24:23\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_16:24:23 -- Epoch: 6/20; Train; loss: 0.158; acc: 0.944; precision: 0.935, recall: 0.954, macrof1: 0.944, weightedf1: 0.944\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 16:24:31\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_16:24:31 -- Epoch: 6/20; Valid; loss: 0.312; acc: 0.878; precision: 0.882, recall: 0.874, macrof1: 0.878, weightedf1: 0.878\u001b[0m\n",
      "\u001b[92m2020-09-10 16:24:31\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model (early stopped) with least valid loss (checkpoint: 5) at ./models/wikigaz_en_ft_ocr_gru_v001_n16000/wikigaz_en_ft_ocr_gru_v001_n16000.model\u001b[0m\n",
      "\u001b[92m2020-09-10 16:24:31\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n",
      "\u001b[92m2020-09-10 16:24:31\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mEarly stopping at epoch: 6, selected epoch: 5\u001b[0m\n",
      "\n",
      "\n",
      "\n",
      "\n",
      "====================\n",
      "User time: 89.9560\n",
      "====================\n"
     ]
    }
   ],
   "source": [
    "from DeezyMatch import finetune as dm_finetune\n",
    "\n",
    "n_ft_examples = 16000\n",
    "\n",
    "# Fine-tune the pretrained WG:en GRU model on n_ft_examples training instances from the OCR dataset.\n",
    "# The pretrained weights and vocabulary are read from pretrained_model_path and pretrained_vocab_path.\n",
    "dm_finetune(input_file_path=\"./inputs/input_dfm_gru_model_A.yaml\",\n",
    "            dataset_path=\"./dataset/ocr_trainval.txt\",\n",
    "            model_name=f\"wikigaz_en_ft_ocr_gru_v001_n{n_ft_examples}\",\n",
    "            pretrained_model_path=\"./models/wikigaz_en_gru_001/wikigaz_en_gru_001.model\",\n",
    "            pretrained_vocab_path=\"./models/wikigaz_en_gru_001/wikigaz_en_gru_001.vocab\",\n",
    "            n_train_examples=n_ft_examples\n",
    "           )"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 16:24:31\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mread input file: ./inputs/input_dfm_gru_model_A.yaml\u001b[0m\n",
      "\u001b[92m2020-09-10 16:24:31\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32mpytorch will use: cuda:1\u001b[0m\n",
      "\u001b[92m2020-09-10 16:24:31\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mread CSV file: ./dataset/ocr_trainval.txt\u001b[0m\n",
      "\u001b[92m2020-09-10 16:24:32\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32mnumber of labels, True: 42301 and False: 42302\u001b[0m\n",
      "\u001b[92m2020-09-10 16:24:32\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mSplitting the Dataset\u001b[0m\n",
      "\u001b[92m2020-09-10 16:24:32\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mfinish splitting the Dataset. User time: 0.05030536651611328\u001b[0m\n",
      "\u001b[92m2020-09-10 16:24:32\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msplits are as follow:\n",
      "train           32000\n",
      "not_assigned    27221\n",
      "val             25380\n",
      "test                2\n",
      "Name: split, dtype: int64\u001b[0m\n",
      "\u001b[92m2020-09-10 16:24:32\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mstart creating a lookup table and convert characters to indices\u001b[0m\n",
      "\u001b[92m2020-09-10 16:24:32\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32m-- create vocabulary\u001b[0m\n",
      "\u001b[92m2020-09-10 16:24:33\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32m-- convert tokens to indices\u001b[0m\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 16:24:34\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mskipping 0 lines\u001b[0m\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "\n",
      "============================================================\n",
      "List all parameters in the model\n",
      "============================================================\n",
      "emb.weight False\n",
      "rnn_1.weight_ih_l0 False\n",
      "rnn_1.weight_hh_l0 False\n",
      "rnn_1.bias_ih_l0 False\n",
      "rnn_1.bias_hh_l0 False\n",
      "rnn_1.weight_ih_l0_reverse False\n",
      "rnn_1.weight_hh_l0_reverse False\n",
      "rnn_1.bias_ih_l0_reverse False\n",
      "rnn_1.bias_hh_l0_reverse False\n",
      "rnn_1.weight_ih_l1 False\n",
      "rnn_1.weight_hh_l1 False\n",
      "rnn_1.bias_ih_l1 False\n",
      "rnn_1.bias_hh_l1 False\n",
      "rnn_1.weight_ih_l1_reverse False\n",
      "rnn_1.weight_hh_l1_reverse False\n",
      "rnn_1.bias_ih_l1_reverse False\n",
      "rnn_1.bias_hh_l1_reverse False\n",
      "attn_step1.weight False\n",
      "attn_step1.bias False\n",
      "attn_step2.weight False\n",
      "attn_step2.bias False\n",
      "fc1.weight True\n",
      "fc1.bias True\n",
      "fc2.weight True\n",
      "fc2.bias True\n",
      "============================================================\n",
      "\n",
      "\n",
      "\n",
      "\u001b[92m2020-09-10 16:24:35\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m******************************\u001b[0m\n",
      "\u001b[92m2020-09-10 16:24:35\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m**** (Bi-directional) GRU ****\u001b[0m\n",
      "\u001b[92m2020-09-10 16:24:35\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m******************************\u001b[0m\n",
      "\u001b[92m2020-09-10 16:24:35\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mNumber of batches: 500\u001b[0m\n",
      "\u001b[92m2020-09-10 16:24:35\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mNumber of epochs: 20\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=20.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=500.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "\n",
      "\n",
      "====================\n",
      "Total number of params: 684843\n",
      "\n",
      "two_parallel_rnns (\n",
      "  (emb): Embedding(7542, 60), weights=((7542, 60),), parameters=452520\n",
      "  (rnn_1): GRU(60, 60, num_layers=2, dropout=0.01, bidirectional=True), weights=((180, 60), (180, 60), (180,), (180,), (180, 60), (180, 60), (180,), (180,), (180, 120), (180, 60), (180,), (180,), (180, 120), (180, 60), (180,), (180,)), parameters=109440\n",
      "  (attn_step1): Linear(in_features=120, out_features=60, bias=True), weights=((60, 120), (60,)), parameters=7260\n",
      "  (attn_step2): Linear(in_features=60, out_features=1, bias=True), weights=((1, 60), (1,)), parameters=61\n",
      "  (fc1): Linear(in_features=960, out_features=120, bias=True), weights=((120, 960), (120,)), parameters=115320\n",
      "  (fc2): Linear(in_features=120, out_features=2, bias=True), weights=((2, 120), (2,)), parameters=242\n",
      ")\n",
      "====================\n",
      "\n",
      "\n",
      "\u001b[92m2020-09-10 16:24:48\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_16:24:48 -- Epoch: 1/20; Train; loss: 0.504; acc: 0.767; precision: 0.756, recall: 0.788, macrof1: 0.767, weightedf1: 0.767\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 16:24:56\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_16:24:56 -- Epoch: 1/20; Valid; loss: 0.357; acc: 0.850; precision: 0.838, recall: 0.869, macrof1: 0.850, weightedf1: 0.850\u001b[0m\n",
      "\u001b[92m2020-09-10 16:24:56\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=500.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 16:25:10\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_16:25:10 -- Epoch: 2/20; Train; loss: 0.307; acc: 0.873; precision: 0.861, recall: 0.889, macrof1: 0.873, weightedf1: 0.873\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 16:25:18\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_16:25:18 -- Epoch: 2/20; Valid; loss: 0.300; acc: 0.878; precision: 0.873, recall: 0.883, macrof1: 0.878, weightedf1: 0.878\u001b[0m\n",
      "\u001b[92m2020-09-10 16:25:18\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=500.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 16:25:32\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_16:25:32 -- Epoch: 3/20; Train; loss: 0.245; acc: 0.904; precision: 0.895, recall: 0.916, macrof1: 0.904, weightedf1: 0.904\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 16:25:40\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_16:25:40 -- Epoch: 3/20; Valid; loss: 0.277; acc: 0.887; precision: 0.870, recall: 0.911, macrof1: 0.887, weightedf1: 0.887\u001b[0m\n",
      "\u001b[92m2020-09-10 16:25:40\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=500.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 16:25:54\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_16:25:54 -- Epoch: 4/20; Train; loss: 0.197; acc: 0.925; precision: 0.916, recall: 0.936, macrof1: 0.925, weightedf1: 0.925\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 16:26:02\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_16:26:02 -- Epoch: 4/20; Valid; loss: 0.271; acc: 0.892; precision: 0.886, recall: 0.900, macrof1: 0.892, weightedf1: 0.892\u001b[0m\n",
      "\u001b[92m2020-09-10 16:26:02\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=500.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 16:26:16\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_16:26:16 -- Epoch: 5/20; Train; loss: 0.161; acc: 0.940; precision: 0.932, recall: 0.950, macrof1: 0.940, weightedf1: 0.940\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 16:26:24\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_16:26:24 -- Epoch: 5/20; Valid; loss: 0.268; acc: 0.896; precision: 0.894, recall: 0.898, macrof1: 0.896, weightedf1: 0.896\u001b[0m\n",
      "\u001b[92m2020-09-10 16:26:24\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=500.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 16:26:39\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_16:26:39 -- Epoch: 6/20; Train; loss: 0.128; acc: 0.955; precision: 0.949, recall: 0.962, macrof1: 0.955, weightedf1: 0.955\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 16:26:47\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_16:26:47 -- Epoch: 6/20; Valid; loss: 0.282; acc: 0.895; precision: 0.882, recall: 0.912, macrof1: 0.895, weightedf1: 0.895\u001b[0m\n",
      "\u001b[92m2020-09-10 16:26:47\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model (early stopped) with least valid loss (checkpoint: 5) at ./models/wikigaz_en_ft_ocr_gru_v001_n32000/wikigaz_en_ft_ocr_gru_v001_n32000.model\u001b[0m\n",
      "\u001b[92m2020-09-10 16:26:47\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n",
      "\u001b[92m2020-09-10 16:26:47\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mEarly stopping at epoch: 6, selected epoch: 5\u001b[0m\n",
      "\n",
      "\n",
      "\n",
      "\n",
      "====================\n",
      "User time: 132.1225\n",
      "====================\n"
     ]
    }
   ],
   "source": [
    "from DeezyMatch import finetune as dm_finetune\n",
    "\n",
    "n_ft_examples = 32000\n",
    "\n",
    "# Fine-tune the pretrained WG:en GRU model on n_ft_examples training instances from the OCR dataset.\n",
    "# The pretrained weights and vocabulary are read from pretrained_model_path and pretrained_vocab_path.\n",
    "dm_finetune(input_file_path=\"./inputs/input_dfm_gru_model_A.yaml\",\n",
    "            dataset_path=\"./dataset/ocr_trainval.txt\",\n",
    "            model_name=f\"wikigaz_en_ft_ocr_gru_v001_n{n_ft_examples}\",\n",
    "            pretrained_model_path=\"./models/wikigaz_en_gru_001/wikigaz_en_gru_001.model\",\n",
    "            pretrained_vocab_path=\"./models/wikigaz_en_gru_001/wikigaz_en_gru_001.vocab\",\n",
    "            n_train_examples=n_ft_examples\n",
    "           )"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 22,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:24:25\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mread input file: ./inputs/input_dfm_gru_model_A_no_early_stopping.yaml\u001b[0m\n",
      "\u001b[92m2020-09-10 17:24:25\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32mpytorch will use: cuda:1\u001b[0m\n",
      "\u001b[92m2020-09-10 17:24:25\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mread CSV file: ./dataset/ocr_trainval.txt\u001b[0m\n",
      "\u001b[92m2020-09-10 17:24:26\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32mnumber of labels, True: 42301 and False: 42302\u001b[0m\n",
      "\u001b[92m2020-09-10 17:24:26\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mSplitting the Dataset\u001b[0m\n",
      "\u001b[92m2020-09-10 17:24:26\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mfinish splitting the Dataset. User time: 0.06342124938964844\u001b[0m\n",
      "\u001b[92m2020-09-10 17:24:26\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msplits are as follow:\n",
      "train    64000\n",
      "val      20603\n",
      "Name: split, dtype: int64\u001b[0m\n",
      "\u001b[92m2020-09-10 17:24:26\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mstart creating a lookup table and convert characters to indices\u001b[0m\n",
      "\u001b[92m2020-09-10 17:24:26\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32m-- create vocabulary\u001b[0m\n",
      "\u001b[92m2020-09-10 17:24:27\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32m-- convert tokens to indices\u001b[0m\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:24:28\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mskipping 0 lines\u001b[0m\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "\n",
      "============================================================\n",
      "List all parameters in the model\n",
      "============================================================\n",
      "emb.weight False\n",
      "rnn_1.weight_ih_l0 False\n",
      "rnn_1.weight_hh_l0 False\n",
      "rnn_1.bias_ih_l0 False\n",
      "rnn_1.bias_hh_l0 False\n",
      "rnn_1.weight_ih_l0_reverse False\n",
      "rnn_1.weight_hh_l0_reverse False\n",
      "rnn_1.bias_ih_l0_reverse False\n",
      "rnn_1.bias_hh_l0_reverse False\n",
      "rnn_1.weight_ih_l1 False\n",
      "rnn_1.weight_hh_l1 False\n",
      "rnn_1.bias_ih_l1 False\n",
      "rnn_1.bias_hh_l1 False\n",
      "rnn_1.weight_ih_l1_reverse False\n",
      "rnn_1.weight_hh_l1_reverse False\n",
      "rnn_1.bias_ih_l1_reverse False\n",
      "rnn_1.bias_hh_l1_reverse False\n",
      "attn_step1.weight False\n",
      "attn_step1.bias False\n",
      "attn_step2.weight False\n",
      "attn_step2.bias False\n",
      "fc1.weight True\n",
      "fc1.bias True\n",
      "fc2.weight True\n",
      "fc2.bias True\n",
      "============================================================\n",
      "\n",
      "\n",
      "\n",
      "\u001b[92m2020-09-10 17:24:29\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m******************************\u001b[0m\n",
      "\u001b[92m2020-09-10 17:24:29\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m**** (Bi-directional) GRU ****\u001b[0m\n",
      "\u001b[92m2020-09-10 17:24:29\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m******************************\u001b[0m\n",
      "\u001b[92m2020-09-10 17:24:29\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mNumber of batches: 1000\u001b[0m\n",
      "\u001b[92m2020-09-10 17:24:29\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mNumber of epochs: 10\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "8d1738f9541646fbafc5a1e3acbe1974",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=10.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=1000.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "\n",
      "\n",
      "====================\n",
      "Total number of params: 684843\n",
      "\n",
      "two_parallel_rnns (\n",
      "  (emb): Embedding(7542, 60), weights=((7542, 60),), parameters=452520\n",
      "  (rnn_1): GRU(60, 60, num_layers=2, dropout=0.01, bidirectional=True), weights=((180, 60), (180, 60), (180,), (180,), (180, 60), (180, 60), (180,), (180,), (180, 120), (180, 60), (180,), (180,), (180, 120), (180, 60), (180,), (180,)), parameters=109440\n",
      "  (attn_step1): Linear(in_features=120, out_features=60, bias=True), weights=((60, 120), (60,)), parameters=7260\n",
      "  (attn_step2): Linear(in_features=60, out_features=1, bias=True), weights=((1, 60), (1,)), parameters=61\n",
      "  (fc1): Linear(in_features=960, out_features=120, bias=True), weights=((120, 960), (120,)), parameters=115320\n",
      "  (fc2): Linear(in_features=120, out_features=2, bias=True), weights=((2, 120), (2,)), parameters=242\n",
      ")\n",
      "====================\n",
      "\n",
      "\n",
      "\u001b[92m2020-09-10 17:24:55\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_17:24:55 -- Epoch: 1/10; Train; loss: 0.412; acc: 0.818; precision: 0.810, recall: 0.832, macrof1: 0.818, weightedf1: 0.818\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=322.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:25:01\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_17:25:01 -- Epoch: 1/10; Valid; loss: 0.289; acc: 0.883; precision: 0.868, recall: 0.903, macrof1: 0.883, weightedf1: 0.883\u001b[0m\n",
      "\u001b[92m2020-09-10 17:25:01\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=1000.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:25:26\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_17:25:26 -- Epoch: 2/10; Train; loss: 0.253; acc: 0.899; precision: 0.889, recall: 0.912, macrof1: 0.899, weightedf1: 0.899\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=322.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:25:33\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_17:25:33 -- Epoch: 2/10; Valid; loss: 0.245; acc: 0.902; precision: 0.887, recall: 0.921, macrof1: 0.902, weightedf1: 0.902\u001b[0m\n",
      "\u001b[92m2020-09-10 17:25:33\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=1000.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:25:58\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_17:25:58 -- Epoch: 3/10; Train; loss: 0.201; acc: 0.921; precision: 0.913, recall: 0.931, macrof1: 0.921, weightedf1: 0.921\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=322.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:26:04\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_17:26:04 -- Epoch: 3/10; Valid; loss: 0.231; acc: 0.909; precision: 0.888, recall: 0.937, macrof1: 0.909, weightedf1: 0.909\u001b[0m\n",
      "\u001b[92m2020-09-10 17:26:04\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=1000.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:26:30\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_17:26:30 -- Epoch: 4/10; Train; loss: 0.164; acc: 0.937; precision: 0.929, recall: 0.947, macrof1: 0.937, weightedf1: 0.937\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=322.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:26:36\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_17:26:36 -- Epoch: 4/10; Valid; loss: 0.224; acc: 0.914; precision: 0.888, recall: 0.947, macrof1: 0.914, weightedf1: 0.914\u001b[0m\n",
      "\u001b[92m2020-09-10 17:26:36\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=1000.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:27:02\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_17:27:02 -- Epoch: 5/10; Train; loss: 0.137; acc: 0.948; precision: 0.942, recall: 0.955, macrof1: 0.948, weightedf1: 0.948\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=322.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:27:08\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_17:27:08 -- Epoch: 5/10; Valid; loss: 0.226; acc: 0.914; precision: 0.904, recall: 0.927, macrof1: 0.914, weightedf1: 0.914\u001b[0m\n",
      "\u001b[92m2020-09-10 17:27:08\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=1000.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:27:33\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_17:27:33 -- Epoch: 6/10; Train; loss: 0.113; acc: 0.958; precision: 0.952, recall: 0.965, macrof1: 0.958, weightedf1: 0.958\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=322.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:27:39\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_17:27:39 -- Epoch: 6/10; Valid; loss: 0.229; acc: 0.916; precision: 0.907, recall: 0.928, macrof1: 0.916, weightedf1: 0.916\u001b[0m\n",
      "\u001b[92m2020-09-10 17:27:39\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=1000.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:28:05\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_17:28:05 -- Epoch: 7/10; Train; loss: 0.094; acc: 0.967; precision: 0.962, recall: 0.972, macrof1: 0.967, weightedf1: 0.967\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=322.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:28:11\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_17:28:11 -- Epoch: 7/10; Valid; loss: 0.246; acc: 0.914; precision: 0.912, recall: 0.917, macrof1: 0.914, weightedf1: 0.914\u001b[0m\n",
      "\u001b[92m2020-09-10 17:28:11\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=1000.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:28:37\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_17:28:37 -- Epoch: 8/10; Train; loss: 0.078; acc: 0.973; precision: 0.968, recall: 0.978, macrof1: 0.973, weightedf1: 0.973\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=322.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:28:43\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_17:28:43 -- Epoch: 8/10; Valid; loss: 0.258; acc: 0.914; precision: 0.915, recall: 0.914, macrof1: 0.914, weightedf1: 0.914\u001b[0m\n",
      "\u001b[92m2020-09-10 17:28:43\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=1000.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:29:08\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_17:29:08 -- Epoch: 9/10; Train; loss: 0.066; acc: 0.977; precision: 0.973, recall: 0.982, macrof1: 0.977, weightedf1: 0.977\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=322.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:29:15\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_17:29:15 -- Epoch: 9/10; Valid; loss: 0.282; acc: 0.915; precision: 0.893, recall: 0.942, macrof1: 0.915, weightedf1: 0.915\u001b[0m\n",
      "\u001b[92m2020-09-10 17:29:15\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=1000.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:29:42\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_17:29:42 -- Epoch: 10/10; Train; loss: 0.056; acc: 0.981; precision: 0.978, recall: 0.984, macrof1: 0.981, weightedf1: 0.981\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=322.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:29:49\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_17:29:49 -- Epoch: 10/10; Valid; loss: 0.292; acc: 0.915; precision: 0.904, recall: 0.929, macrof1: 0.915, weightedf1: 0.915\u001b[0m\n",
      "\u001b[92m2020-09-10 17:29:49\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n",
      "\n",
      "\u001b[92m2020-09-10 17:29:49\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model with least valid loss (checkpoint: 4) at ./models/wikigaz_en_ft_ocr_gru_v001_n64000/wikigaz_en_ft_ocr_gru_v001_n64000.model\u001b[0m\n",
      "\n",
      "\n",
      "\n",
      "====================\n",
      "User time: 320.0446\n",
      "====================\n"
     ]
    }
   ],
   "source": [
    "from DeezyMatch import finetune as dm_finetune\n",
    "\n",
    "n_ft_examples = 64000\n",
    "\n",
    "# fine-tune a pretrained model stored at pretrained_model_path and pretrained_vocab_path \n",
    "dm_finetune(input_file_path=\"./inputs/input_dfm_gru_model_A_no_early_stopping.yaml\", \n",
    "            dataset_path=\"./dataset/ocr_trainval.txt\", \n",
    "            model_name=f\"wikigaz_en_ft_ocr_gru_v001_n{n_ft_examples}\",\n",
    "            pretrained_model_path=\"./models/wikigaz_en_gru_001/wikigaz_en_gru_001.model\", \n",
    "            pretrained_vocab_path=\"./models/wikigaz_en_gru_001/wikigaz_en_gru_001.vocab\",\n",
    "            n_train_examples=n_ft_examples\n",
    "           )"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 21,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:18:30\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mread input file: ./inputs/input_dfm_gru_model_A_no_early_stopping.yaml\u001b[0m\n",
      "\u001b[92m2020-09-10 17:18:30\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32mpytorch will use: cuda:1\u001b[0m\n",
      "\u001b[92m2020-09-10 17:18:30\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mread CSV file: ./dataset/ocr_trainval.txt\u001b[0m\n",
      "\u001b[92m2020-09-10 17:18:30\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32mnumber of labels, True: 42301 and False: 42302\u001b[0m\n",
      "\u001b[92m2020-09-10 17:18:30\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mSplitting the Dataset\u001b[0m\n",
      "\u001b[92m2020-09-10 17:18:31\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mfinish splitting the Dataset. User time: 0.050637245178222656\u001b[0m\n",
      "\u001b[92m2020-09-10 17:18:31\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msplits are as follow:\n",
      "train    84000\n",
      "val        603\n",
      "Name: split, dtype: int64\u001b[0m\n",
      "\u001b[92m2020-09-10 17:18:31\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mstart creating a lookup table and convert characters to indices\u001b[0m\n",
      "\u001b[92m2020-09-10 17:18:31\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32m-- create vocabulary\u001b[0m\n",
      "\u001b[92m2020-09-10 17:18:32\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32m-- convert tokens to indices\u001b[0m\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      "length s1:   0%|          | 0/84000 [00:00<?, ?it/s]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:18:32\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mskipping 0 lines\u001b[0m\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "                                                                     \r"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "\n",
      "============================================================\n",
      "List all parameters in the model\n",
      "============================================================\n",
      "emb.weight False\n",
      "rnn_1.weight_ih_l0 False\n",
      "rnn_1.weight_hh_l0 False\n",
      "rnn_1.bias_ih_l0 False\n",
      "rnn_1.bias_hh_l0 False\n",
      "rnn_1.weight_ih_l0_reverse False\n",
      "rnn_1.weight_hh_l0_reverse False\n",
      "rnn_1.bias_ih_l0_reverse False\n",
      "rnn_1.bias_hh_l0_reverse False\n",
      "rnn_1.weight_ih_l1 False\n",
      "rnn_1.weight_hh_l1 False\n",
      "rnn_1.bias_ih_l1 False\n",
      "rnn_1.bias_hh_l1 False\n",
      "rnn_1.weight_ih_l1_reverse False\n",
      "rnn_1.weight_hh_l1_reverse False\n",
      "rnn_1.bias_ih_l1_reverse False\n",
      "rnn_1.bias_hh_l1_reverse False\n",
      "attn_step1.weight False\n",
      "attn_step1.bias False\n",
      "attn_step2.weight False\n",
      "attn_step2.bias False\n",
      "fc1.weight True\n",
      "fc1.bias True\n",
      "fc2.weight True\n",
      "fc2.bias True\n",
      "============================================================\n",
      "\n",
      "\n",
      "\n",
      "\u001b[92m2020-09-10 17:18:33\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m******************************\u001b[0m\n",
      "\u001b[92m2020-09-10 17:18:33\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m**** (Bi-directional) GRU ****\u001b[0m\n",
      "\u001b[92m2020-09-10 17:18:33\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m******************************\u001b[0m\n",
      "\u001b[92m2020-09-10 17:18:33\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mNumber of batches: 1313\u001b[0m\n",
      "\u001b[92m2020-09-10 17:18:33\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mNumber of epochs: 10\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=10.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=1313.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "\n",
      "\n",
      "====================\n",
      "Total number of params: 684843\n",
      "\n",
      "two_parallel_rnns (\n",
      "  (emb): Embedding(7542, 60), weights=((7542, 60),), parameters=452520\n",
      "  (rnn_1): GRU(60, 60, num_layers=2, dropout=0.01, bidirectional=True), weights=((180, 60), (180, 60), (180,), (180,), (180, 60), (180, 60), (180,), (180,), (180, 120), (180, 60), (180,), (180,), (180, 120), (180, 60), (180,), (180,)), parameters=109440\n",
      "  (attn_step1): Linear(in_features=120, out_features=60, bias=True), weights=((60, 120), (60,)), parameters=7260\n",
      "  (attn_step2): Linear(in_features=60, out_features=1, bias=True), weights=((1, 60), (1,)), parameters=61\n",
      "  (fc1): Linear(in_features=960, out_features=120, bias=True), weights=((120, 960), (120,)), parameters=115320\n",
      "  (fc2): Linear(in_features=120, out_features=2, bias=True), weights=((2, 120), (2,)), parameters=242\n",
      ")\n",
      "====================\n",
      "\n",
      "\n",
      "\u001b[92m2020-09-10 17:19:10\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_17:19:10 -- Epoch: 1/10; Train; loss: 0.384; acc: 0.834; precision: 0.827, recall: 0.845, macrof1: 0.834, weightedf1: 0.834\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=10.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:19:10\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_17:19:10 -- Epoch: 1/10; Valid; loss: 0.252; acc: 0.910; precision: 0.933, recall: 0.884, macrof1: 0.910, weightedf1: 0.910\u001b[0m\n",
      "\u001b[92m2020-09-10 17:19:10\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=1313.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:19:46\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_17:19:46 -- Epoch: 2/10; Train; loss: 0.235; acc: 0.906; precision: 0.896, recall: 0.919, macrof1: 0.906, weightedf1: 0.906\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=10.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:19:46\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_17:19:46 -- Epoch: 2/10; Valid; loss: 0.191; acc: 0.924; precision: 0.913, recall: 0.937, macrof1: 0.924, weightedf1: 0.924\u001b[0m\n",
      "\u001b[92m2020-09-10 17:19:46\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=1313.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:20:23\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_17:20:23 -- Epoch: 3/10; Train; loss: 0.186; acc: 0.928; precision: 0.920, recall: 0.938, macrof1: 0.928, weightedf1: 0.928\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=10.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:20:23\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_17:20:23 -- Epoch: 3/10; Valid; loss: 0.183; acc: 0.932; precision: 0.933, recall: 0.930, macrof1: 0.932, weightedf1: 0.932\u001b[0m\n",
      "\u001b[92m2020-09-10 17:20:23\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=1313.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:20:59\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_17:20:59 -- Epoch: 4/10; Train; loss: 0.154; acc: 0.941; precision: 0.933, recall: 0.951, macrof1: 0.941, weightedf1: 0.941\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=10.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:20:59\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_17:20:59 -- Epoch: 4/10; Valid; loss: 0.195; acc: 0.930; precision: 0.954, recall: 0.904, macrof1: 0.930, weightedf1: 0.930\u001b[0m\n",
      "\u001b[92m2020-09-10 17:20:59\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=1313.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:21:36\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_17:21:36 -- Epoch: 5/10; Train; loss: 0.130; acc: 0.951; precision: 0.944, recall: 0.959, macrof1: 0.951, weightedf1: 0.951\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=10.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:21:36\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_17:21:36 -- Epoch: 5/10; Valid; loss: 0.185; acc: 0.925; precision: 0.913, recall: 0.940, macrof1: 0.925, weightedf1: 0.925\u001b[0m\n",
      "\u001b[92m2020-09-10 17:21:36\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=1313.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:22:10\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_17:22:10 -- Epoch: 6/10; Train; loss: 0.109; acc: 0.959; precision: 0.953, recall: 0.966, macrof1: 0.959, weightedf1: 0.959\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=10.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:22:10\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_17:22:10 -- Epoch: 6/10; Valid; loss: 0.207; acc: 0.937; precision: 0.928, recall: 0.947, macrof1: 0.937, weightedf1: 0.937\u001b[0m\n",
      "\u001b[92m2020-09-10 17:22:10\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=1313.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:22:44\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_17:22:44 -- Epoch: 7/10; Train; loss: 0.093; acc: 0.966; precision: 0.961, recall: 0.972, macrof1: 0.966, weightedf1: 0.966\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=10.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:22:44\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_17:22:44 -- Epoch: 7/10; Valid; loss: 0.206; acc: 0.925; precision: 0.927, recall: 0.924, macrof1: 0.925, weightedf1: 0.925\u001b[0m\n",
      "\u001b[92m2020-09-10 17:22:44\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=1313.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:23:18\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_17:23:18 -- Epoch: 8/10; Train; loss: 0.079; acc: 0.971; precision: 0.966, recall: 0.976, macrof1: 0.971, weightedf1: 0.971\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=10.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:23:18\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_17:23:18 -- Epoch: 8/10; Valid; loss: 0.210; acc: 0.929; precision: 0.901, recall: 0.963, macrof1: 0.929, weightedf1: 0.929\u001b[0m\n",
      "\u001b[92m2020-09-10 17:23:18\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=1313.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:23:51\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_17:23:51 -- Epoch: 9/10; Train; loss: 0.069; acc: 0.976; precision: 0.972, recall: 0.980, macrof1: 0.975, weightedf1: 0.975\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=10.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:23:52\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_17:23:52 -- Epoch: 9/10; Valid; loss: 0.216; acc: 0.929; precision: 0.919, recall: 0.940, macrof1: 0.929, weightedf1: 0.929\u001b[0m\n",
      "\u001b[92m2020-09-10 17:23:52\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=1313.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:24:25\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_17:24:25 -- Epoch: 10/10; Train; loss: 0.059; acc: 0.980; precision: 0.976, recall: 0.983, macrof1: 0.980, weightedf1: 0.980\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=10.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:24:25\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_17:24:25 -- Epoch: 10/10; Valid; loss: 0.232; acc: 0.937; precision: 0.946, recall: 0.927, macrof1: 0.937, weightedf1: 0.937\u001b[0m\n",
      "\u001b[92m2020-09-10 17:24:25\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n",
      "\n",
      "\u001b[92m2020-09-10 17:24:25\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model with least valid loss (checkpoint: 3) at ./models/wikigaz_en_ft_ocr_gru_v001_n84000/wikigaz_en_ft_ocr_gru_v001_n84000.model\u001b[0m\n",
      "\n",
      "\n",
      "\n",
      "====================\n",
      "User time: 351.9607\n",
      "====================\n"
     ]
    }
   ],
   "source": [
    "from DeezyMatch import finetune as dm_finetune\n",
    "\n",
    "n_ft_examples = 84000\n",
    "\n",
    "# fine-tune a pretrained model stored at pretrained_model_path and pretrained_vocab_path \n",
    "dm_finetune(input_file_path=\"./inputs/input_dfm_gru_model_A_no_early_stopping.yaml\", \n",
    "            dataset_path=\"./dataset/ocr_trainval.txt\", \n",
    "            model_name=f\"wikigaz_en_ft_ocr_gru_v001_n{n_ft_examples}\",\n",
    "            pretrained_model_path=\"./models/wikigaz_en_gru_001/wikigaz_en_gru_001.model\", \n",
    "            pretrained_vocab_path=\"./models/wikigaz_en_gru_001/wikigaz_en_gru_001.vocab\",\n",
    "            n_train_examples=n_ft_examples\n",
    "           )"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Fine-Tune, model A, LSTM"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 25,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:31:50\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mread input file: ./inputs/input_dfm_lstm_model_A.yaml\u001b[0m\n",
      "\u001b[92m2020-09-10 17:31:50\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32mpytorch will use: cuda:1\u001b[0m\n",
      "\u001b[92m2020-09-10 17:31:50\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mread CSV file: ./dataset/ocr_trainval.txt\u001b[0m\n",
      "\u001b[92m2020-09-10 17:31:51\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32mnumber of labels, True: 42301 and False: 42302\u001b[0m\n",
      "\u001b[92m2020-09-10 17:31:51\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mSplitting the Dataset\u001b[0m\n",
      "\u001b[92m2020-09-10 17:31:51\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mfinish splitting the Dataset. User time: 0.04691624641418457\u001b[0m\n",
      "\u001b[92m2020-09-10 17:31:51\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msplits are as follow:\n",
      "not_assigned    58971\n",
      "val             25380\n",
      "train             250\n",
      "test                2\n",
      "Name: split, dtype: int64\u001b[0m\n",
      "\u001b[92m2020-09-10 17:31:51\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mstart creating a lookup table and convert characters to indices\u001b[0m\n",
      "\u001b[92m2020-09-10 17:31:51\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32m-- create vocabulary\u001b[0m\n",
      "\u001b[92m2020-09-10 17:31:52\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32m-- convert tokens to indices\u001b[0m\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "s1 padding:   0%|          | 0/25380 [00:00<?, ?it/s]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:31:52\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mskipping 0 lines\u001b[0m\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "                                                                     \r"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "\n",
      "============================================================\n",
      "List all parameters in the model\n",
      "============================================================\n",
      "emb.weight False\n",
      "rnn_1.weight_ih_l0 False\n",
      "rnn_1.weight_hh_l0 False\n",
      "rnn_1.bias_ih_l0 False\n",
      "rnn_1.bias_hh_l0 False\n",
      "rnn_1.weight_ih_l0_reverse False\n",
      "rnn_1.weight_hh_l0_reverse False\n",
      "rnn_1.bias_ih_l0_reverse False\n",
      "rnn_1.bias_hh_l0_reverse False\n",
      "rnn_1.weight_ih_l1 False\n",
      "rnn_1.weight_hh_l1 False\n",
      "rnn_1.bias_ih_l1 False\n",
      "rnn_1.bias_hh_l1 False\n",
      "rnn_1.weight_ih_l1_reverse False\n",
      "rnn_1.weight_hh_l1_reverse False\n",
      "rnn_1.bias_ih_l1_reverse False\n",
      "rnn_1.bias_hh_l1_reverse False\n",
      "attn_step1.weight False\n",
      "attn_step1.bias False\n",
      "attn_step2.weight False\n",
      "attn_step2.bias False\n",
      "fc1.weight True\n",
      "fc1.bias True\n",
      "fc2.weight True\n",
      "fc2.bias True\n",
      "============================================================\n",
      "\n",
      "\n",
      "\n",
      "\u001b[92m2020-09-10 17:31:53\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m******************************\u001b[0m\n",
      "\u001b[92m2020-09-10 17:31:53\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m**** (Bi-directional) LSTM ****\u001b[0m\n",
      "\u001b[92m2020-09-10 17:31:53\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m******************************\u001b[0m\n",
      "\u001b[92m2020-09-10 17:31:53\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mNumber of batches: 4\u001b[0m\n",
      "\u001b[92m2020-09-10 17:31:53\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mNumber of epochs: 20\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "8eff4b67d48d46a58f7e221eeff271d0",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=20.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=4.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "\n",
      "\n",
      "====================\n",
      "Total number of params: 721323\n",
      "\n",
      "two_parallel_rnns (\n",
      "  (emb): Embedding(7542, 60), weights=((7542, 60),), parameters=452520\n",
      "  (rnn_1): LSTM(60, 60, num_layers=2, dropout=0.01, bidirectional=True), weights=((240, 60), (240, 60), (240,), (240,), (240, 60), (240, 60), (240,), (240,), (240, 120), (240, 60), (240,), (240,), (240, 120), (240, 60), (240,), (240,)), parameters=145920\n",
      "  (attn_step1): Linear(in_features=120, out_features=60, bias=True), weights=((60, 120), (60,)), parameters=7260\n",
      "  (attn_step2): Linear(in_features=60, out_features=1, bias=True), weights=((1, 60), (1,)), parameters=61\n",
      "  (fc1): Linear(in_features=960, out_features=120, bias=True), weights=((120, 960), (120,)), parameters=115320\n",
      "  (fc2): Linear(in_features=120, out_features=2, bias=True), weights=((2, 120), (2,)), parameters=242\n",
      ")\n",
      "====================\n",
      "\n",
      "\n",
      "\u001b[92m2020-09-10 17:31:53\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_17:31:53 -- Epoch: 1/20; Train; loss: 1.620; acc: 0.448; precision: 0.446, recall: 0.432, macrof1: 0.448, weightedf1: 0.448\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:32:02\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_17:32:02 -- Epoch: 1/20; Valid; loss: 1.699; acc: 0.472; precision: 0.474, recall: 0.509, macrof1: 0.471, weightedf1: 0.471\u001b[0m\n",
      "\u001b[92m2020-09-10 17:32:02\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=4.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:32:02\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_17:32:02 -- Epoch: 2/20; Train; loss: 1.309; acc: 0.492; precision: 0.492, recall: 0.472, macrof1: 0.492, weightedf1: 0.492\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:32:13\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_17:32:13 -- Epoch: 2/20; Valid; loss: 1.589; acc: 0.487; precision: 0.488, recall: 0.526, macrof1: 0.486, weightedf1: 0.486\u001b[0m\n",
      "\u001b[92m2020-09-10 17:32:13\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=4.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:32:13\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_17:32:13 -- Epoch: 3/20; Train; loss: 1.089; acc: 0.524; precision: 0.525, recall: 0.504, macrof1: 0.524, weightedf1: 0.524\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:32:23\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_17:32:23 -- Epoch: 3/20; Valid; loss: 1.493; acc: 0.503; precision: 0.503, recall: 0.549, macrof1: 0.502, weightedf1: 0.502\u001b[0m\n",
      "\u001b[92m2020-09-10 17:32:23\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=4.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:32:23\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_17:32:23 -- Epoch: 4/20; Train; loss: 0.899; acc: 0.620; precision: 0.625, recall: 0.600, macrof1: 0.620, weightedf1: 0.620\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:32:33\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_17:32:33 -- Epoch: 4/20; Valid; loss: 1.411; acc: 0.516; precision: 0.515, recall: 0.566, macrof1: 0.515, weightedf1: 0.515\u001b[0m\n",
      "\u001b[92m2020-09-10 17:32:33\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=4.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:32:33\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_17:32:33 -- Epoch: 5/20; Train; loss: 0.749; acc: 0.684; precision: 0.689, recall: 0.672, macrof1: 0.684, weightedf1: 0.684\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:32:42\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_17:32:42 -- Epoch: 5/20; Valid; loss: 1.343; acc: 0.526; precision: 0.523, recall: 0.578, macrof1: 0.524, weightedf1: 0.524\u001b[0m\n",
      "\u001b[92m2020-09-10 17:32:42\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=4.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:32:43\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_17:32:43 -- Epoch: 6/20; Train; loss: 0.635; acc: 0.732; precision: 0.734, recall: 0.728, macrof1: 0.732, weightedf1: 0.732\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:32:51\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_17:32:51 -- Epoch: 6/20; Valid; loss: 1.286; acc: 0.536; precision: 0.532, recall: 0.600, macrof1: 0.535, weightedf1: 0.535\u001b[0m\n",
      "\u001b[92m2020-09-10 17:32:51\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=4.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:32:51\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_17:32:51 -- Epoch: 7/20; Train; loss: 0.534; acc: 0.772; precision: 0.766, recall: 0.784, macrof1: 0.772, weightedf1: 0.772\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:33:00\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_17:33:00 -- Epoch: 7/20; Valid; loss: 1.241; acc: 0.546; precision: 0.540, recall: 0.616, macrof1: 0.543, weightedf1: 0.543\u001b[0m\n",
      "\u001b[92m2020-09-10 17:33:00\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=4.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:33:00\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_17:33:00 -- Epoch: 8/20; Train; loss: 0.463; acc: 0.804; precision: 0.802, recall: 0.808, macrof1: 0.804, weightedf1: 0.804\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:33:08\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_17:33:08 -- Epoch: 8/20; Valid; loss: 1.201; acc: 0.553; precision: 0.546, recall: 0.627, macrof1: 0.551, weightedf1: 0.551\u001b[0m\n",
      "\u001b[92m2020-09-10 17:33:08\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=4.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:33:09\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_17:33:09 -- Epoch: 9/20; Train; loss: 0.400; acc: 0.860; precision: 0.846, recall: 0.880, macrof1: 0.860, weightedf1: 0.860\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:33:17\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_17:33:17 -- Epoch: 9/20; Valid; loss: 1.168; acc: 0.562; precision: 0.554, recall: 0.638, macrof1: 0.559, weightedf1: 0.559\u001b[0m\n",
      "\u001b[92m2020-09-10 17:33:17\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=4.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:33:17\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_17:33:17 -- Epoch: 10/20; Train; loss: 0.348; acc: 0.884; precision: 0.858, recall: 0.920, macrof1: 0.884, weightedf1: 0.884\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:33:26\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_17:33:26 -- Epoch: 10/20; Valid; loss: 1.141; acc: 0.569; precision: 0.559, recall: 0.646, macrof1: 0.566, weightedf1: 0.566\u001b[0m\n",
      "\u001b[92m2020-09-10 17:33:26\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=4.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:33:26\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_17:33:26 -- Epoch: 11/20; Train; loss: 0.308; acc: 0.896; precision: 0.878, recall: 0.920, macrof1: 0.896, weightedf1: 0.896\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:33:35\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_17:33:35 -- Epoch: 11/20; Valid; loss: 1.119; acc: 0.574; precision: 0.564, recall: 0.652, macrof1: 0.572, weightedf1: 0.572\u001b[0m\n",
      "\u001b[92m2020-09-10 17:33:35\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=4.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:33:35\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_17:33:35 -- Epoch: 12/20; Train; loss: 0.275; acc: 0.916; precision: 0.894, recall: 0.944, macrof1: 0.916, weightedf1: 0.916\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:33:43\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_17:33:43 -- Epoch: 12/20; Valid; loss: 1.100; acc: 0.579; precision: 0.569, recall: 0.654, macrof1: 0.577, weightedf1: 0.577\u001b[0m\n",
      "\u001b[92m2020-09-10 17:33:43\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=4.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:33:43\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_17:33:43 -- Epoch: 13/20; Train; loss: 0.245; acc: 0.928; precision: 0.908, recall: 0.952, macrof1: 0.928, weightedf1: 0.928\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:33:52\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_17:33:52 -- Epoch: 13/20; Valid; loss: 1.086; acc: 0.584; precision: 0.573, recall: 0.658, macrof1: 0.581, weightedf1: 0.581\u001b[0m\n",
      "\u001b[92m2020-09-10 17:33:52\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=4.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:33:52\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_17:33:52 -- Epoch: 14/20; Train; loss: 0.212; acc: 0.940; precision: 0.923, recall: 0.960, macrof1: 0.940, weightedf1: 0.940\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:34:01\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_17:34:01 -- Epoch: 14/20; Valid; loss: 1.074; acc: 0.587; precision: 0.576, recall: 0.664, macrof1: 0.585, weightedf1: 0.585\u001b[0m\n",
      "\u001b[92m2020-09-10 17:34:01\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=4.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:34:01\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_17:34:01 -- Epoch: 15/20; Train; loss: 0.191; acc: 0.960; precision: 0.939, recall: 0.984, macrof1: 0.960, weightedf1: 0.960\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:34:09\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_17:34:09 -- Epoch: 15/20; Valid; loss: 1.067; acc: 0.592; precision: 0.579, recall: 0.670, macrof1: 0.589, weightedf1: 0.589\u001b[0m\n",
      "\u001b[92m2020-09-10 17:34:09\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=4.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:34:10\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_17:34:10 -- Epoch: 16/20; Train; loss: 0.171; acc: 0.968; precision: 0.953, recall: 0.984, macrof1: 0.968, weightedf1: 0.968\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:34:18\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_17:34:18 -- Epoch: 16/20; Valid; loss: 1.060; acc: 0.596; precision: 0.583, recall: 0.674, macrof1: 0.593, weightedf1: 0.593\u001b[0m\n",
      "\u001b[92m2020-09-10 17:34:18\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=4.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:34:18\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_17:34:18 -- Epoch: 17/20; Train; loss: 0.152; acc: 0.972; precision: 0.961, recall: 0.984, macrof1: 0.972, weightedf1: 0.972\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:34:27\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_17:34:27 -- Epoch: 17/20; Valid; loss: 1.055; acc: 0.598; precision: 0.585, recall: 0.674, macrof1: 0.596, weightedf1: 0.596\u001b[0m\n",
      "\u001b[92m2020-09-10 17:34:27\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=4.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:34:27\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_17:34:27 -- Epoch: 18/20; Train; loss: 0.141; acc: 0.976; precision: 0.961, recall: 0.992, macrof1: 0.976, weightedf1: 0.976\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:34:36\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_17:34:36 -- Epoch: 18/20; Valid; loss: 1.051; acc: 0.601; precision: 0.588, recall: 0.674, macrof1: 0.599, weightedf1: 0.599\u001b[0m\n",
      "\u001b[92m2020-09-10 17:34:36\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=4.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:34:36\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_17:34:36 -- Epoch: 19/20; Train; loss: 0.129; acc: 0.980; precision: 0.969, recall: 0.992, macrof1: 0.980, weightedf1: 0.980\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:34:44\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_17:34:44 -- Epoch: 19/20; Valid; loss: 1.047; acc: 0.604; precision: 0.591, recall: 0.676, macrof1: 0.602, weightedf1: 0.602\u001b[0m\n",
      "\u001b[92m2020-09-10 17:34:44\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=4.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:34:45\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_17:34:45 -- Epoch: 20/20; Train; loss: 0.116; acc: 0.988; precision: 0.984, recall: 0.992, macrof1: 0.988, weightedf1: 0.988\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:34:54\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_17:34:54 -- Epoch: 20/20; Valid; loss: 1.046; acc: 0.607; precision: 0.594, recall: 0.677, macrof1: 0.605, weightedf1: 0.605\u001b[0m\n",
      "\u001b[92m2020-09-10 17:34:54\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n",
      "\n",
      "\u001b[92m2020-09-10 17:34:54\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model with least valid loss (checkpoint: 20) at ./models/wikigaz_en_ft_ocr_lstm_v001_n250/wikigaz_en_ft_ocr_lstm_v001_n250.model\u001b[0m\n",
      "\n",
      "\n",
      "\n",
      "====================\n",
      "User time: 180.7035\n",
      "====================\n"
     ]
    }
   ],
   "source": [
    "from DeezyMatch import finetune as dm_finetune\n",
    "\n",
    "n_ft_examples = 250\n",
    "\n",
    "# Fine-tune the pretrained baseline (WG:en) LSTM model, loaded from pretrained_model_path and\n",
    "# pretrained_vocab_path, on n_ft_examples training examples from the OCR dataset.\n",
    "# The model A input file freezes both the embedding and the recurrent layers.\n",
    "dm_finetune(input_file_path=\"./inputs/input_dfm_lstm_model_A.yaml\",\n",
    "            dataset_path=\"./dataset/ocr_trainval.txt\",\n",
    "            model_name=f\"wikigaz_en_ft_ocr_lstm_v001_n{n_ft_examples}\",\n",
    "            pretrained_model_path=\"./models/wikigaz_en_lstm_001/wikigaz_en_lstm_001.model\",\n",
    "            pretrained_vocab_path=\"./models/wikigaz_en_lstm_001/wikigaz_en_lstm_001.vocab\",\n",
    "            n_train_examples=n_ft_examples\n",
    "           )"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 26,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:34:54\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mread input file: ./inputs/input_dfm_lstm_model_A.yaml\u001b[0m\n",
      "\u001b[92m2020-09-10 17:34:54\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32mpytorch will use: cuda:1\u001b[0m\n",
      "\u001b[92m2020-09-10 17:34:54\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mread CSV file: ./dataset/ocr_trainval.txt\u001b[0m\n",
      "\u001b[92m2020-09-10 17:34:54\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32mnumber of labels, True: 42301 and False: 42302\u001b[0m\n",
      "\u001b[92m2020-09-10 17:34:54\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mSplitting the Dataset\u001b[0m\n",
      "\u001b[92m2020-09-10 17:34:54\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mfinish splitting the Dataset. User time: 0.04954886436462402\u001b[0m\n",
      "\u001b[92m2020-09-10 17:34:54\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msplits are as follow:\n",
      "not_assigned    58721\n",
      "val             25380\n",
      "train             500\n",
      "test                2\n",
      "Name: split, dtype: int64\u001b[0m\n",
      "\u001b[92m2020-09-10 17:34:54\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mstart creating a lookup table and convert characters to indices\u001b[0m\n",
      "\u001b[92m2020-09-10 17:34:54\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32m-- create vocabulary\u001b[0m\n",
      "\u001b[92m2020-09-10 17:34:55\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32m-- convert tokens to indices\u001b[0m\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "s1 padding:   0%|          | 0/25380 [00:00<?, ?it/s]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:34:56\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mskipping 0 lines\u001b[0m\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "                                                                     \r"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "\n",
      "============================================================\n",
      "List all parameters in the model\n",
      "============================================================\n",
      "emb.weight False\n",
      "rnn_1.weight_ih_l0 False\n",
      "rnn_1.weight_hh_l0 False\n",
      "rnn_1.bias_ih_l0 False\n",
      "rnn_1.bias_hh_l0 False\n",
      "rnn_1.weight_ih_l0_reverse False\n",
      "rnn_1.weight_hh_l0_reverse False\n",
      "rnn_1.bias_ih_l0_reverse False\n",
      "rnn_1.bias_hh_l0_reverse False\n",
      "rnn_1.weight_ih_l1 False\n",
      "rnn_1.weight_hh_l1 False\n",
      "rnn_1.bias_ih_l1 False\n",
      "rnn_1.bias_hh_l1 False\n",
      "rnn_1.weight_ih_l1_reverse False\n",
      "rnn_1.weight_hh_l1_reverse False\n",
      "rnn_1.bias_ih_l1_reverse False\n",
      "rnn_1.bias_hh_l1_reverse False\n",
      "attn_step1.weight False\n",
      "attn_step1.bias False\n",
      "attn_step2.weight False\n",
      "attn_step2.bias False\n",
      "fc1.weight True\n",
      "fc1.bias True\n",
      "fc2.weight True\n",
      "fc2.bias True\n",
      "============================================================\n",
      "\n",
      "\n",
      "\n",
      "\u001b[92m2020-09-10 17:34:56\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m******************************\u001b[0m\n",
      "\u001b[92m2020-09-10 17:34:56\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m**** (Bi-directional) LSTM ****\u001b[0m\n",
      "\u001b[92m2020-09-10 17:34:56\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m******************************\u001b[0m\n",
      "\u001b[92m2020-09-10 17:34:56\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mNumber of batches: 8\u001b[0m\n",
      "\u001b[92m2020-09-10 17:34:56\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mNumber of epochs: 20\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "4afe10660f304e6cacfeee917f512081",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=20.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=8.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "\n",
      "\n",
      "====================\n",
      "Total number of params: 721323\n",
      "\n",
      "two_parallel_rnns (\n",
      "  (emb): Embedding(7542, 60), weights=((7542, 60),), parameters=452520\n",
      "  (rnn_1): LSTM(60, 60, num_layers=2, dropout=0.01, bidirectional=True), weights=((240, 60), (240, 60), (240,), (240,), (240, 60), (240, 60), (240,), (240,), (240, 120), (240, 60), (240,), (240,), (240, 120), (240, 60), (240,), (240,)), parameters=145920\n",
      "  (attn_step1): Linear(in_features=120, out_features=60, bias=True), weights=((60, 120), (60,)), parameters=7260\n",
      "  (attn_step2): Linear(in_features=60, out_features=1, bias=True), weights=((1, 60), (1,)), parameters=61\n",
      "  (fc1): Linear(in_features=960, out_features=120, bias=True), weights=((120, 960), (120,)), parameters=115320\n",
      "  (fc2): Linear(in_features=120, out_features=2, bias=True), weights=((2, 120), (2,)), parameters=242\n",
      ")\n",
      "====================\n",
      "\n",
      "\n",
      "\u001b[92m2020-09-10 17:34:57\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_17:34:57 -- Epoch: 1/20; Train; loss: 1.562; acc: 0.460; precision: 0.458, recall: 0.440, macrof1: 0.460, weightedf1: 0.460\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:35:05\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_17:35:05 -- Epoch: 1/20; Valid; loss: 1.557; acc: 0.492; precision: 0.493, recall: 0.543, macrof1: 0.491, weightedf1: 0.491\u001b[0m\n",
      "\u001b[92m2020-09-10 17:35:05\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=8.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:35:06\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_17:35:06 -- Epoch: 2/20; Train; loss: 1.176; acc: 0.522; precision: 0.522, recall: 0.524, macrof1: 0.522, weightedf1: 0.522\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:35:14\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_17:35:14 -- Epoch: 2/20; Valid; loss: 1.362; acc: 0.522; precision: 0.520, recall: 0.563, macrof1: 0.521, weightedf1: 0.521\u001b[0m\n",
      "\u001b[92m2020-09-10 17:35:14\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=8.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:35:14\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_17:35:14 -- Epoch: 3/20; Train; loss: 0.929; acc: 0.606; precision: 0.602, recall: 0.628, macrof1: 0.606, weightedf1: 0.606\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:35:23\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_17:35:23 -- Epoch: 3/20; Valid; loss: 1.222; acc: 0.548; precision: 0.542, recall: 0.615, macrof1: 0.546, weightedf1: 0.546\u001b[0m\n",
      "\u001b[92m2020-09-10 17:35:23\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=8.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:35:23\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_17:35:23 -- Epoch: 4/20; Train; loss: 0.752; acc: 0.660; precision: 0.652, recall: 0.688, macrof1: 0.660, weightedf1: 0.660\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:35:32\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_17:35:32 -- Epoch: 4/20; Valid; loss: 1.114; acc: 0.567; precision: 0.560, recall: 0.629, macrof1: 0.565, weightedf1: 0.565\u001b[0m\n",
      "\u001b[92m2020-09-10 17:35:32\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=8.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:35:32\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_17:35:32 -- Epoch: 5/20; Train; loss: 0.628; acc: 0.708; precision: 0.695, recall: 0.740, macrof1: 0.708, weightedf1: 0.708\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:35:41\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_17:35:41 -- Epoch: 5/20; Valid; loss: 1.036; acc: 0.583; precision: 0.574, recall: 0.645, macrof1: 0.582, weightedf1: 0.582\u001b[0m\n",
      "\u001b[92m2020-09-10 17:35:41\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=8.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:35:41\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_17:35:41 -- Epoch: 6/20; Train; loss: 0.542; acc: 0.760; precision: 0.748, recall: 0.784, macrof1: 0.760, weightedf1: 0.760\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:35:50\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_17:35:50 -- Epoch: 6/20; Valid; loss: 0.978; acc: 0.596; precision: 0.587, recall: 0.653, macrof1: 0.595, weightedf1: 0.595\u001b[0m\n",
      "\u001b[92m2020-09-10 17:35:50\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=8.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:35:50\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_17:35:50 -- Epoch: 7/20; Train; loss: 0.471; acc: 0.788; precision: 0.775, recall: 0.812, macrof1: 0.788, weightedf1: 0.788\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:35:58\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_17:35:58 -- Epoch: 7/20; Valid; loss: 0.935; acc: 0.609; precision: 0.598, recall: 0.666, macrof1: 0.608, weightedf1: 0.608\u001b[0m\n",
      "\u001b[92m2020-09-10 17:35:58\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=8.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:35:59\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_17:35:59 -- Epoch: 8/20; Train; loss: 0.413; acc: 0.830; precision: 0.821, recall: 0.844, macrof1: 0.830, weightedf1: 0.830\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:36:07\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_17:36:07 -- Epoch: 8/20; Valid; loss: 0.902; acc: 0.620; precision: 0.608, recall: 0.674, macrof1: 0.619, weightedf1: 0.619\u001b[0m\n",
      "\u001b[92m2020-09-10 17:36:07\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=8.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:36:08\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_17:36:08 -- Epoch: 9/20; Train; loss: 0.361; acc: 0.858; precision: 0.843, recall: 0.880, macrof1: 0.858, weightedf1: 0.858\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:36:16\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_17:36:16 -- Epoch: 9/20; Valid; loss: 0.879; acc: 0.629; precision: 0.616, recall: 0.687, macrof1: 0.628, weightedf1: 0.628\u001b[0m\n",
      "\u001b[92m2020-09-10 17:36:16\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=8.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:36:16\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_17:36:16 -- Epoch: 10/20; Train; loss: 0.323; acc: 0.884; precision: 0.861, recall: 0.916, macrof1: 0.884, weightedf1: 0.884\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:36:25\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_17:36:25 -- Epoch: 10/20; Valid; loss: 0.858; acc: 0.636; precision: 0.623, recall: 0.689, macrof1: 0.635, weightedf1: 0.635\u001b[0m\n",
      "\u001b[92m2020-09-10 17:36:25\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=8.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:36:25\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_17:36:25 -- Epoch: 11/20; Train; loss: 0.285; acc: 0.904; precision: 0.888, recall: 0.924, macrof1: 0.904, weightedf1: 0.904\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:36:34\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_17:36:34 -- Epoch: 11/20; Valid; loss: 0.843; acc: 0.644; precision: 0.631, recall: 0.693, macrof1: 0.643, weightedf1: 0.643\u001b[0m\n",
      "\u001b[92m2020-09-10 17:36:34\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=8.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:36:34\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_17:36:34 -- Epoch: 12/20; Train; loss: 0.254; acc: 0.922; precision: 0.904, recall: 0.944, macrof1: 0.922, weightedf1: 0.922\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:36:43\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_17:36:43 -- Epoch: 12/20; Valid; loss: 0.832; acc: 0.652; precision: 0.639, recall: 0.696, macrof1: 0.651, weightedf1: 0.651\u001b[0m\n",
      "\u001b[92m2020-09-10 17:36:43\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=8.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:36:43\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_17:36:43 -- Epoch: 13/20; Train; loss: 0.230; acc: 0.936; precision: 0.926, recall: 0.948, macrof1: 0.936, weightedf1: 0.936\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:36:51\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_17:36:51 -- Epoch: 13/20; Valid; loss: 0.820; acc: 0.658; precision: 0.647, recall: 0.697, macrof1: 0.658, weightedf1: 0.658\u001b[0m\n",
      "\u001b[92m2020-09-10 17:36:51\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=8.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:36:52\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_17:36:52 -- Epoch: 14/20; Train; loss: 0.204; acc: 0.946; precision: 0.937, recall: 0.956, macrof1: 0.946, weightedf1: 0.946\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:37:00\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_17:37:00 -- Epoch: 14/20; Valid; loss: 0.811; acc: 0.665; precision: 0.653, recall: 0.703, macrof1: 0.664, weightedf1: 0.664\u001b[0m\n",
      "\u001b[92m2020-09-10 17:37:00\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=8.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:37:01\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_17:37:01 -- Epoch: 15/20; Train; loss: 0.184; acc: 0.958; precision: 0.953, recall: 0.964, macrof1: 0.958, weightedf1: 0.958\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:37:09\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_17:37:09 -- Epoch: 15/20; Valid; loss: 0.803; acc: 0.672; precision: 0.659, recall: 0.710, macrof1: 0.671, weightedf1: 0.671\u001b[0m\n",
      "\u001b[92m2020-09-10 17:37:09\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=8.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:37:09\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_17:37:09 -- Epoch: 16/20; Train; loss: 0.163; acc: 0.972; precision: 0.961, recall: 0.984, macrof1: 0.972, weightedf1: 0.972\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:37:18\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_17:37:18 -- Epoch: 16/20; Valid; loss: 0.796; acc: 0.676; precision: 0.664, recall: 0.714, macrof1: 0.676, weightedf1: 0.676\u001b[0m\n",
      "\u001b[92m2020-09-10 17:37:18\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=8.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:37:18\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_17:37:18 -- Epoch: 17/20; Train; loss: 0.146; acc: 0.980; precision: 0.976, recall: 0.984, macrof1: 0.980, weightedf1: 0.980\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:37:27\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_17:37:27 -- Epoch: 17/20; Valid; loss: 0.789; acc: 0.681; precision: 0.670, recall: 0.712, macrof1: 0.681, weightedf1: 0.681\u001b[0m\n",
      "\u001b[92m2020-09-10 17:37:27\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=8.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:37:27\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_17:37:27 -- Epoch: 18/20; Train; loss: 0.130; acc: 0.990; precision: 0.988, recall: 0.992, macrof1: 0.990, weightedf1: 0.990\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:37:36\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_17:37:36 -- Epoch: 18/20; Valid; loss: 0.786; acc: 0.685; precision: 0.673, recall: 0.719, macrof1: 0.685, weightedf1: 0.685\u001b[0m\n",
      "\u001b[92m2020-09-10 17:37:36\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=8.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:37:36\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_17:37:36 -- Epoch: 19/20; Train; loss: 0.116; acc: 0.996; precision: 0.992, recall: 1.000, macrof1: 0.996, weightedf1: 0.996\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:37:45\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_17:37:45 -- Epoch: 19/20; Valid; loss: 0.781; acc: 0.690; precision: 0.680, recall: 0.716, macrof1: 0.690, weightedf1: 0.690\u001b[0m\n",
      "\u001b[92m2020-09-10 17:37:45\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=8.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:37:45\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_17:37:45 -- Epoch: 20/20; Train; loss: 0.104; acc: 0.996; precision: 0.992, recall: 1.000, macrof1: 0.996, weightedf1: 0.996\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:37:54\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_17:37:54 -- Epoch: 20/20; Valid; loss: 0.780; acc: 0.693; precision: 0.682, recall: 0.724, macrof1: 0.693, weightedf1: 0.693\u001b[0m\n",
      "\u001b[92m2020-09-10 17:37:54\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n",
      "\n",
      "\u001b[92m2020-09-10 17:37:54\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model with least valid loss (checkpoint: 20) at ./models/wikigaz_en_ft_ocr_lstm_v001_n500/wikigaz_en_ft_ocr_lstm_v001_n500.model\u001b[0m\n",
      "\n",
      "\n",
      "\n",
      "====================\n",
      "User time: 177.3186\n",
      "====================\n"
     ]
    }
   ],
   "source": [
    "from DeezyMatch import finetune as dm_finetune\n",
    "\n",
    "n_ft_examples = 500\n",
    "\n",
    "# Fine-tune the pretrained WG:en baseline (loaded from pretrained_model_path and\n",
    "# pretrained_vocab_path) on n_ft_examples training instances from the OCR dataset.\n",
    "dm_finetune(input_file_path=\"./inputs/input_dfm_lstm_model_A.yaml\",\n",
    "            dataset_path=\"./dataset/ocr_trainval.txt\",\n",
    "            model_name=f\"wikigaz_en_ft_ocr_lstm_v001_n{n_ft_examples}\",\n",
    "            pretrained_model_path=\"./models/wikigaz_en_lstm_001/wikigaz_en_lstm_001.model\",\n",
    "            pretrained_vocab_path=\"./models/wikigaz_en_lstm_001/wikigaz_en_lstm_001.vocab\",\n",
    "            n_train_examples=n_ft_examples\n",
    "           )"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 27,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:37:54\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mread input file: ./inputs/input_dfm_lstm_model_A.yaml\u001b[0m\n",
      "\u001b[92m2020-09-10 17:37:54\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32mpytorch will use: cuda:1\u001b[0m\n",
      "\u001b[92m2020-09-10 17:37:54\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mread CSV file: ./dataset/ocr_trainval.txt\u001b[0m\n",
      "\u001b[92m2020-09-10 17:37:54\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32mnumber of labels, True: 42301 and False: 42302\u001b[0m\n",
      "\u001b[92m2020-09-10 17:37:54\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mSplitting the Dataset\u001b[0m\n",
      "\u001b[92m2020-09-10 17:37:54\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mfinish splitting the Dataset. User time: 0.054633378982543945\u001b[0m\n",
      "\u001b[92m2020-09-10 17:37:54\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msplits are as follow:\n",
      "not_assigned    58221\n",
      "val             25380\n",
      "train            1000\n",
      "test                2\n",
      "Name: split, dtype: int64\u001b[0m\n",
      "\u001b[92m2020-09-10 17:37:54\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mstart creating a lookup table and convert characters to indices\u001b[0m\n",
      "\u001b[92m2020-09-10 17:37:55\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32m-- create vocabulary\u001b[0m\n",
      "\u001b[92m2020-09-10 17:37:56\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32m-- convert tokens to indices\u001b[0m\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "s1 padding:   0%|          | 0/25380 [00:00<?, ?it/s]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:37:56\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mskipping 0 lines\u001b[0m\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "                                                                     \r"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "\n",
      "============================================================\n",
      "List all parameters in the model\n",
      "============================================================\n",
      "emb.weight False\n",
      "rnn_1.weight_ih_l0 False\n",
      "rnn_1.weight_hh_l0 False\n",
      "rnn_1.bias_ih_l0 False\n",
      "rnn_1.bias_hh_l0 False\n",
      "rnn_1.weight_ih_l0_reverse False\n",
      "rnn_1.weight_hh_l0_reverse False\n",
      "rnn_1.bias_ih_l0_reverse False\n",
      "rnn_1.bias_hh_l0_reverse False\n",
      "rnn_1.weight_ih_l1 False\n",
      "rnn_1.weight_hh_l1 False\n",
      "rnn_1.bias_ih_l1 False\n",
      "rnn_1.bias_hh_l1 False\n",
      "rnn_1.weight_ih_l1_reverse False\n",
      "rnn_1.weight_hh_l1_reverse False\n",
      "rnn_1.bias_ih_l1_reverse False\n",
      "rnn_1.bias_hh_l1_reverse False\n",
      "attn_step1.weight False\n",
      "attn_step1.bias False\n",
      "attn_step2.weight False\n",
      "attn_step2.bias False\n",
      "fc1.weight True\n",
      "fc1.bias True\n",
      "fc2.weight True\n",
      "fc2.bias True\n",
      "============================================================\n",
      "\n",
      "\n",
      "\n",
      "\u001b[92m2020-09-10 17:37:57\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m******************************\u001b[0m\n",
      "\u001b[92m2020-09-10 17:37:57\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m**** (Bi-directional) LSTM ****\u001b[0m\n",
      "\u001b[92m2020-09-10 17:37:57\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m******************************\u001b[0m\n",
      "\u001b[92m2020-09-10 17:37:57\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mNumber of batches: 16\u001b[0m\n",
      "\u001b[92m2020-09-10 17:37:57\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mNumber of epochs: 20\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "2631fb22977a49da912b3e49361a96b6",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=20.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=16.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "\n",
      "\n",
      "====================\n",
      "Total number of params: 721323\n",
      "\n",
      "two_parallel_rnns (\n",
      "  (emb): Embedding(7542, 60), weights=((7542, 60),), parameters=452520\n",
      "  (rnn_1): LSTM(60, 60, num_layers=2, dropout=0.01, bidirectional=True), weights=((240, 60), (240, 60), (240,), (240,), (240, 60), (240, 60), (240,), (240,), (240, 120), (240, 60), (240,), (240,), (240, 120), (240, 60), (240,), (240,)), parameters=145920\n",
      "  (attn_step1): Linear(in_features=120, out_features=60, bias=True), weights=((60, 120), (60,)), parameters=7260\n",
      "  (attn_step2): Linear(in_features=60, out_features=1, bias=True), weights=((1, 60), (1,)), parameters=61\n",
      "  (fc1): Linear(in_features=960, out_features=120, bias=True), weights=((120, 960), (120,)), parameters=115320\n",
      "  (fc2): Linear(in_features=120, out_features=2, bias=True), weights=((2, 120), (2,)), parameters=242\n",
      ")\n",
      "====================\n",
      "\n",
      "\n",
      "\u001b[92m2020-09-10 17:37:57\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_17:37:57 -- Epoch: 1/20; Train; loss: 1.514; acc: 0.493; precision: 0.493, recall: 0.514, macrof1: 0.493, weightedf1: 0.493\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:38:06\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_17:38:06 -- Epoch: 1/20; Valid; loss: 1.341; acc: 0.520; precision: 0.519, recall: 0.559, macrof1: 0.520, weightedf1: 0.520\u001b[0m\n",
      "\u001b[92m2020-09-10 17:38:06\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=16.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:38:07\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_17:38:07 -- Epoch: 2/20; Train; loss: 1.017; acc: 0.595; precision: 0.592, recall: 0.614, macrof1: 0.595, weightedf1: 0.595\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:38:15\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_17:38:15 -- Epoch: 2/20; Valid; loss: 1.063; acc: 0.570; precision: 0.567, recall: 0.590, macrof1: 0.570, weightedf1: 0.570\u001b[0m\n",
      "\u001b[92m2020-09-10 17:38:15\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=16.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:38:16\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_17:38:16 -- Epoch: 3/20; Train; loss: 0.759; acc: 0.654; precision: 0.650, recall: 0.666, macrof1: 0.654, weightedf1: 0.654\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:38:24\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_17:38:24 -- Epoch: 3/20; Valid; loss: 0.906; acc: 0.607; precision: 0.601, recall: 0.634, macrof1: 0.607, weightedf1: 0.607\u001b[0m\n",
      "\u001b[92m2020-09-10 17:38:24\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=16.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:38:25\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_17:38:25 -- Epoch: 4/20; Train; loss: 0.599; acc: 0.716; precision: 0.705, recall: 0.744, macrof1: 0.716, weightedf1: 0.716\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:38:33\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_17:38:33 -- Epoch: 4/20; Valid; loss: 0.816; acc: 0.634; precision: 0.624, recall: 0.674, macrof1: 0.633, weightedf1: 0.633\u001b[0m\n",
      "\u001b[92m2020-09-10 17:38:33\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=16.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:38:34\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_17:38:34 -- Epoch: 5/20; Train; loss: 0.495; acc: 0.761; precision: 0.753, recall: 0.776, macrof1: 0.761, weightedf1: 0.761\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:38:43\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_17:38:43 -- Epoch: 5/20; Valid; loss: 0.752; acc: 0.659; precision: 0.655, recall: 0.673, macrof1: 0.659, weightedf1: 0.659\u001b[0m\n",
      "\u001b[92m2020-09-10 17:38:43\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=16.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:38:43\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_17:38:43 -- Epoch: 6/20; Train; loss: 0.422; acc: 0.812; precision: 0.811, recall: 0.814, macrof1: 0.812, weightedf1: 0.812\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:38:52\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_17:38:52 -- Epoch: 6/20; Valid; loss: 0.705; acc: 0.678; precision: 0.674, recall: 0.689, macrof1: 0.678, weightedf1: 0.678\u001b[0m\n",
      "\u001b[92m2020-09-10 17:38:52\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=16.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:38:52\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_17:38:52 -- Epoch: 7/20; Train; loss: 0.365; acc: 0.852; precision: 0.851, recall: 0.854, macrof1: 0.852, weightedf1: 0.852\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:39:01\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_17:39:01 -- Epoch: 7/20; Valid; loss: 0.668; acc: 0.696; precision: 0.691, recall: 0.708, macrof1: 0.696, weightedf1: 0.696\u001b[0m\n",
      "\u001b[92m2020-09-10 17:39:01\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=16.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:39:01\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_17:39:01 -- Epoch: 8/20; Train; loss: 0.316; acc: 0.883; precision: 0.878, recall: 0.890, macrof1: 0.883, weightedf1: 0.883\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:39:10\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_17:39:10 -- Epoch: 8/20; Valid; loss: 0.645; acc: 0.710; precision: 0.709, recall: 0.713, macrof1: 0.710, weightedf1: 0.710\u001b[0m\n",
      "\u001b[92m2020-09-10 17:39:10\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=16.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:39:10\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_17:39:10 -- Epoch: 9/20; Train; loss: 0.271; acc: 0.912; precision: 0.902, recall: 0.924, macrof1: 0.912, weightedf1: 0.912\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:39:19\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_17:39:19 -- Epoch: 9/20; Valid; loss: 0.627; acc: 0.722; precision: 0.718, recall: 0.733, macrof1: 0.722, weightedf1: 0.722\u001b[0m\n",
      "\u001b[92m2020-09-10 17:39:19\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=16.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:39:20\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_17:39:20 -- Epoch: 10/20; Train; loss: 0.237; acc: 0.928; precision: 0.928, recall: 0.928, macrof1: 0.928, weightedf1: 0.928\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:39:28\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_17:39:28 -- Epoch: 10/20; Valid; loss: 0.615; acc: 0.733; precision: 0.733, recall: 0.731, macrof1: 0.733, weightedf1: 0.733\u001b[0m\n",
      "\u001b[92m2020-09-10 17:39:28\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=16.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:39:29\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_17:39:29 -- Epoch: 11/20; Train; loss: 0.213; acc: 0.944; precision: 0.934, recall: 0.956, macrof1: 0.944, weightedf1: 0.944\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:39:38\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_17:39:38 -- Epoch: 11/20; Valid; loss: 0.605; acc: 0.740; precision: 0.737, recall: 0.748, macrof1: 0.740, weightedf1: 0.740\u001b[0m\n",
      "\u001b[92m2020-09-10 17:39:38\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=16.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:39:38\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_17:39:38 -- Epoch: 12/20; Train; loss: 0.185; acc: 0.957; precision: 0.942, recall: 0.974, macrof1: 0.957, weightedf1: 0.957\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:39:47\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_17:39:47 -- Epoch: 12/20; Valid; loss: 0.598; acc: 0.748; precision: 0.745, recall: 0.754, macrof1: 0.748, weightedf1: 0.748\u001b[0m\n",
      "\u001b[92m2020-09-10 17:39:47\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=16.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:39:47\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_17:39:47 -- Epoch: 13/20; Train; loss: 0.166; acc: 0.965; precision: 0.959, recall: 0.972, macrof1: 0.965, weightedf1: 0.965\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:39:56\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_17:39:56 -- Epoch: 13/20; Valid; loss: 0.592; acc: 0.753; precision: 0.754, recall: 0.752, macrof1: 0.753, weightedf1: 0.753\u001b[0m\n",
      "\u001b[92m2020-09-10 17:39:56\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=16.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:39:56\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_17:39:56 -- Epoch: 14/20; Train; loss: 0.146; acc: 0.970; precision: 0.959, recall: 0.982, macrof1: 0.970, weightedf1: 0.970\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:40:05\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_17:40:05 -- Epoch: 14/20; Valid; loss: 0.591; acc: 0.758; precision: 0.756, recall: 0.762, macrof1: 0.758, weightedf1: 0.758\u001b[0m\n",
      "\u001b[92m2020-09-10 17:40:05\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=16.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:40:05\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_17:40:05 -- Epoch: 15/20; Train; loss: 0.131; acc: 0.975; precision: 0.967, recall: 0.984, macrof1: 0.975, weightedf1: 0.975\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:40:14\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_17:40:14 -- Epoch: 15/20; Valid; loss: 0.589; acc: 0.761; precision: 0.764, recall: 0.756, macrof1: 0.761, weightedf1: 0.761\u001b[0m\n",
      "\u001b[92m2020-09-10 17:40:14\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=16.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:40:14\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_17:40:14 -- Epoch: 16/20; Train; loss: 0.115; acc: 0.983; precision: 0.974, recall: 0.992, macrof1: 0.983, weightedf1: 0.983\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:40:23\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_17:40:23 -- Epoch: 16/20; Valid; loss: 0.588; acc: 0.764; precision: 0.761, recall: 0.769, macrof1: 0.764, weightedf1: 0.764\u001b[0m\n",
      "\u001b[92m2020-09-10 17:40:23\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=16.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:40:24\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_17:40:24 -- Epoch: 17/20; Train; loss: 0.103; acc: 0.989; precision: 0.982, recall: 0.996, macrof1: 0.989, weightedf1: 0.989\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:40:32\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_17:40:32 -- Epoch: 17/20; Valid; loss: 0.591; acc: 0.766; precision: 0.768, recall: 0.762, macrof1: 0.766, weightedf1: 0.766\u001b[0m\n",
      "\u001b[92m2020-09-10 17:40:32\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model (early stopped) with least valid loss (checkpoint: 16) at ./models/wikigaz_en_ft_ocr_lstm_v001_n1000/wikigaz_en_ft_ocr_lstm_v001_n1000.model\u001b[0m\n",
      "\u001b[92m2020-09-10 17:40:32\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n",
      "\u001b[92m2020-09-10 17:40:32\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mEarly stopping at epoch: 17, selected epoch: 16\u001b[0m\n",
      "\n",
      "\n",
      "\n",
      "\n",
      "====================\n",
      "User time: 155.5745\n",
      "====================\n"
     ]
    }
   ],
   "source": [
    "from DeezyMatch import finetune as dm_finetune\n",
    "\n",
    "n_ft_examples = 1000\n",
    "\n",
    "# Fine-tune the pretrained model at pretrained_model_path (vocabulary at pretrained_vocab_path),\n",
    "# using n_ft_examples training examples from the OCR dataset\n",
    "dm_finetune(input_file_path=\"./inputs/input_dfm_lstm_model_A.yaml\",\n",
    "            dataset_path=\"./dataset/ocr_trainval.txt\",\n",
    "            model_name=f\"wikigaz_en_ft_ocr_lstm_v001_n{n_ft_examples}\",\n",
    "            pretrained_model_path=\"./models/wikigaz_en_lstm_001/wikigaz_en_lstm_001.model\",\n",
    "            pretrained_vocab_path=\"./models/wikigaz_en_lstm_001/wikigaz_en_lstm_001.vocab\",\n",
    "            n_train_examples=n_ft_examples\n",
    "           )"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 28,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:40:32\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mread input file: ./inputs/input_dfm_lstm_model_A.yaml\u001b[0m\n",
      "\u001b[92m2020-09-10 17:40:32\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32mpytorch will use: cuda:1\u001b[0m\n",
      "\u001b[92m2020-09-10 17:40:32\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mread CSV file: ./dataset/ocr_trainval.txt\u001b[0m\n",
      "\u001b[92m2020-09-10 17:40:33\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32mnumber of labels, True: 42301 and False: 42302\u001b[0m\n",
      "\u001b[92m2020-09-10 17:40:33\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mSplitting the Dataset\u001b[0m\n",
      "\u001b[92m2020-09-10 17:40:33\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mfinish splitting the Dataset. User time: 0.05253171920776367\u001b[0m\n",
      "\u001b[92m2020-09-10 17:40:33\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msplits are as follow:\n",
      "not_assigned    57221\n",
      "val             25380\n",
      "train            2000\n",
      "test                2\n",
      "Name: split, dtype: int64\u001b[0m\n",
      "\u001b[92m2020-09-10 17:40:33\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mstart creating a lookup table and convert characters to indices\u001b[0m\n",
      "\u001b[92m2020-09-10 17:40:33\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32m-- create vocabulary\u001b[0m\n",
      "\u001b[92m2020-09-10 17:40:34\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32m-- convert tokens to indices\u001b[0m\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "s1 padding:   0%|          | 0/25380 [00:00<?, ?it/s]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:40:35\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mskipping 0 lines\u001b[0m\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "                                                                     \r"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "\n",
      "============================================================\n",
      "List all parameters in the model\n",
      "============================================================\n",
      "emb.weight False\n",
      "rnn_1.weight_ih_l0 False\n",
      "rnn_1.weight_hh_l0 False\n",
      "rnn_1.bias_ih_l0 False\n",
      "rnn_1.bias_hh_l0 False\n",
      "rnn_1.weight_ih_l0_reverse False\n",
      "rnn_1.weight_hh_l0_reverse False\n",
      "rnn_1.bias_ih_l0_reverse False\n",
      "rnn_1.bias_hh_l0_reverse False\n",
      "rnn_1.weight_ih_l1 False\n",
      "rnn_1.weight_hh_l1 False\n",
      "rnn_1.bias_ih_l1 False\n",
      "rnn_1.bias_hh_l1 False\n",
      "rnn_1.weight_ih_l1_reverse False\n",
      "rnn_1.weight_hh_l1_reverse False\n",
      "rnn_1.bias_ih_l1_reverse False\n",
      "rnn_1.bias_hh_l1_reverse False\n",
      "attn_step1.weight False\n",
      "attn_step1.bias False\n",
      "attn_step2.weight False\n",
      "attn_step2.bias False\n",
      "fc1.weight True\n",
      "fc1.bias True\n",
      "fc2.weight True\n",
      "fc2.bias True\n",
      "============================================================\n",
      "\n",
      "\n",
      "\n",
      "\u001b[92m2020-09-10 17:40:35\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m******************************\u001b[0m\n",
      "\u001b[92m2020-09-10 17:40:35\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m**** (Bi-directional) LSTM ****\u001b[0m\n",
      "\u001b[92m2020-09-10 17:40:35\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m******************************\u001b[0m\n",
      "\u001b[92m2020-09-10 17:40:35\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mNumber of batches: 32\u001b[0m\n",
      "\u001b[92m2020-09-10 17:40:35\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mNumber of epochs: 20\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "d8fa7b1289c34b02924704845917319a",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=20.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=32.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "\n",
      "\n",
      "====================\n",
      "Total number of params: 721323\n",
      "\n",
      "two_parallel_rnns (\n",
      "  (emb): Embedding(7542, 60), weights=((7542, 60),), parameters=452520\n",
      "  (rnn_1): LSTM(60, 60, num_layers=2, dropout=0.01, bidirectional=True), weights=((240, 60), (240, 60), (240,), (240,), (240, 60), (240, 60), (240,), (240,), (240, 120), (240, 60), (240,), (240,), (240, 120), (240, 60), (240,), (240,)), parameters=145920\n",
      "  (attn_step1): Linear(in_features=120, out_features=60, bias=True), weights=((60, 120), (60,)), parameters=7260\n",
      "  (attn_step2): Linear(in_features=60, out_features=1, bias=True), weights=((1, 60), (1,)), parameters=61\n",
      "  (fc1): Linear(in_features=960, out_features=120, bias=True), weights=((120, 960), (120,)), parameters=115320\n",
      "  (fc2): Linear(in_features=120, out_features=2, bias=True), weights=((2, 120), (2,)), parameters=242\n",
      ")\n",
      "====================\n",
      "\n",
      "\n",
      "\u001b[92m2020-09-10 17:40:36\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_17:40:36 -- Epoch: 1/20; Train; loss: 1.323; acc: 0.522; precision: 0.522, recall: 0.528, macrof1: 0.522, weightedf1: 0.522\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:40:45\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_17:40:45 -- Epoch: 1/20; Valid; loss: 1.067; acc: 0.574; precision: 0.570, recall: 0.601, macrof1: 0.574, weightedf1: 0.574\u001b[0m\n",
      "\u001b[92m2020-09-10 17:40:45\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=32.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:40:46\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_17:40:46 -- Epoch: 2/20; Train; loss: 0.783; acc: 0.653; precision: 0.641, recall: 0.699, macrof1: 0.653, weightedf1: 0.653\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:40:54\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_17:40:54 -- Epoch: 2/20; Valid; loss: 0.792; acc: 0.641; precision: 0.633, recall: 0.667, macrof1: 0.640, weightedf1: 0.640\u001b[0m\n",
      "\u001b[92m2020-09-10 17:40:54\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=32.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:40:55\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_17:40:55 -- Epoch: 3/20; Train; loss: 0.561; acc: 0.731; precision: 0.726, recall: 0.743, macrof1: 0.731, weightedf1: 0.731\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:41:04\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_17:41:04 -- Epoch: 3/20; Valid; loss: 0.665; acc: 0.689; precision: 0.683, recall: 0.704, macrof1: 0.689, weightedf1: 0.689\u001b[0m\n",
      "\u001b[92m2020-09-10 17:41:04\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=32.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:41:05\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_17:41:05 -- Epoch: 4/20; Train; loss: 0.456; acc: 0.790; precision: 0.788, recall: 0.793, macrof1: 0.790, weightedf1: 0.790\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:41:13\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_17:41:13 -- Epoch: 4/20; Valid; loss: 0.593; acc: 0.726; precision: 0.715, recall: 0.750, macrof1: 0.725, weightedf1: 0.725\u001b[0m\n",
      "\u001b[92m2020-09-10 17:41:13\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=32.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:41:14\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_17:41:14 -- Epoch: 5/20; Train; loss: 0.385; acc: 0.845; precision: 0.840, recall: 0.852, macrof1: 0.845, weightedf1: 0.845\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:41:23\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_17:41:23 -- Epoch: 5/20; Valid; loss: 0.551; acc: 0.751; precision: 0.752, recall: 0.747, macrof1: 0.751, weightedf1: 0.751\u001b[0m\n",
      "\u001b[92m2020-09-10 17:41:23\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=32.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:41:24\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_17:41:24 -- Epoch: 6/20; Train; loss: 0.329; acc: 0.876; precision: 0.876, recall: 0.877, macrof1: 0.876, weightedf1: 0.876\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:41:32\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_17:41:32 -- Epoch: 6/20; Valid; loss: 0.518; acc: 0.768; precision: 0.764, recall: 0.776, macrof1: 0.768, weightedf1: 0.768\u001b[0m\n",
      "\u001b[92m2020-09-10 17:41:32\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=32.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:41:33\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_17:41:33 -- Epoch: 7/20; Train; loss: 0.286; acc: 0.897; precision: 0.889, recall: 0.907, macrof1: 0.897, weightedf1: 0.897\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:41:42\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_17:41:42 -- Epoch: 7/20; Valid; loss: 0.499; acc: 0.781; precision: 0.778, recall: 0.785, macrof1: 0.781, weightedf1: 0.781\u001b[0m\n",
      "\u001b[92m2020-09-10 17:41:42\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=32.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:41:43\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_17:41:43 -- Epoch: 8/20; Train; loss: 0.247; acc: 0.916; precision: 0.915, recall: 0.917, macrof1: 0.916, weightedf1: 0.916\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:41:52\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_17:41:52 -- Epoch: 8/20; Valid; loss: 0.486; acc: 0.789; precision: 0.782, recall: 0.803, macrof1: 0.789, weightedf1: 0.789\u001b[0m\n",
      "\u001b[92m2020-09-10 17:41:52\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=32.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:41:53\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_17:41:53 -- Epoch: 9/20; Train; loss: 0.216; acc: 0.933; precision: 0.930, recall: 0.938, macrof1: 0.933, weightedf1: 0.933\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:42:01\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_17:42:01 -- Epoch: 9/20; Valid; loss: 0.476; acc: 0.797; precision: 0.795, recall: 0.801, macrof1: 0.797, weightedf1: 0.797\u001b[0m\n",
      "\u001b[92m2020-09-10 17:42:01\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=32.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:42:02\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_17:42:02 -- Epoch: 10/20; Train; loss: 0.189; acc: 0.946; precision: 0.940, recall: 0.953, macrof1: 0.946, weightedf1: 0.946\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:42:11\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_17:42:11 -- Epoch: 10/20; Valid; loss: 0.468; acc: 0.804; precision: 0.810, recall: 0.794, macrof1: 0.804, weightedf1: 0.804\u001b[0m\n",
      "\u001b[92m2020-09-10 17:42:11\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=32.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:42:12\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_17:42:12 -- Epoch: 11/20; Train; loss: 0.164; acc: 0.958; precision: 0.956, recall: 0.959, macrof1: 0.957, weightedf1: 0.957\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:42:20\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_17:42:20 -- Epoch: 11/20; Valid; loss: 0.466; acc: 0.807; precision: 0.802, recall: 0.815, macrof1: 0.807, weightedf1: 0.807\u001b[0m\n",
      "\u001b[92m2020-09-10 17:42:20\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=32.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:42:21\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_17:42:21 -- Epoch: 12/20; Train; loss: 0.147; acc: 0.966; precision: 0.963, recall: 0.968, macrof1: 0.965, weightedf1: 0.965\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:42:30\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_17:42:30 -- Epoch: 12/20; Valid; loss: 0.462; acc: 0.811; precision: 0.808, recall: 0.815, macrof1: 0.811, weightedf1: 0.811\u001b[0m\n",
      "\u001b[92m2020-09-10 17:42:30\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=32.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:42:31\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_17:42:31 -- Epoch: 13/20; Train; loss: 0.123; acc: 0.978; precision: 0.977, recall: 0.979, macrof1: 0.978, weightedf1: 0.978\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:42:40\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_17:42:40 -- Epoch: 13/20; Valid; loss: 0.463; acc: 0.815; precision: 0.816, recall: 0.813, macrof1: 0.815, weightedf1: 0.815\u001b[0m\n",
      "\u001b[92m2020-09-10 17:42:40\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model (early stopped) with least valid loss (checkpoint: 12) at ./models/wikigaz_en_ft_ocr_lstm_v001_n2000/wikigaz_en_ft_ocr_lstm_v001_n2000.model\u001b[0m\n",
      "\u001b[92m2020-09-10 17:42:40\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n",
      "\u001b[92m2020-09-10 17:42:40\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mEarly stopping at epoch: 13, selected epoch: 12\u001b[0m\n",
      "\n",
      "\n",
      "\n",
      "\n",
      "====================\n",
      "User time: 124.7803\n",
      "====================\n"
     ]
    }
   ],
   "source": [
    "from DeezyMatch import finetune as dm_finetune\n",
    "\n",
    "n_ft_examples = 2000\n",
    "\n",
    "# Fine-tune the pretrained model (weights at pretrained_model_path, vocabulary at\n",
    "# pretrained_vocab_path) using n_ft_examples training examples from the OCR dataset\n",
    "dm_finetune(input_file_path=\"./inputs/input_dfm_lstm_model_A.yaml\",\n",
    "            dataset_path=\"./dataset/ocr_trainval.txt\",\n",
    "            model_name=f\"wikigaz_en_ft_ocr_lstm_v001_n{n_ft_examples}\",\n",
    "            pretrained_model_path=\"./models/wikigaz_en_lstm_001/wikigaz_en_lstm_001.model\",\n",
    "            pretrained_vocab_path=\"./models/wikigaz_en_lstm_001/wikigaz_en_lstm_001.vocab\",\n",
    "            n_train_examples=n_ft_examples\n",
    "           )"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 29,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:42:40\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mread input file: ./inputs/input_dfm_lstm_model_A.yaml\u001b[0m\n",
      "\u001b[92m2020-09-10 17:42:40\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32mpytorch will use: cuda:1\u001b[0m\n",
      "\u001b[92m2020-09-10 17:42:40\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mread CSV file: ./dataset/ocr_trainval.txt\u001b[0m\n",
      "\u001b[92m2020-09-10 17:42:40\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32mnumber of labels, True: 42301 and False: 42302\u001b[0m\n",
      "\u001b[92m2020-09-10 17:42:40\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mSplitting the Dataset\u001b[0m\n",
      "\u001b[92m2020-09-10 17:42:40\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mfinish splitting the Dataset. User time: 0.05044984817504883\u001b[0m\n",
      "\u001b[92m2020-09-10 17:42:40\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msplits are as follow:\n",
      "not_assigned    55221\n",
      "val             25380\n",
      "train            4000\n",
      "test                2\n",
      "Name: split, dtype: int64\u001b[0m\n",
      "\u001b[92m2020-09-10 17:42:40\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mstart creating a lookup table and convert characters to indices\u001b[0m\n",
      "\u001b[92m2020-09-10 17:42:41\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32m-- create vocabulary\u001b[0m\n",
      "\u001b[92m2020-09-10 17:42:42\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32m-- convert tokens to indices\u001b[0m\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "                                                    "
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:42:42\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mskipping 0 lines\u001b[0m\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "                                                                     \r"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "\n",
      "============================================================\n",
      "List all parameters in the model\n",
      "============================================================\n",
      "emb.weight False\n",
      "rnn_1.weight_ih_l0 False\n",
      "rnn_1.weight_hh_l0 False\n",
      "rnn_1.bias_ih_l0 False\n",
      "rnn_1.bias_hh_l0 False\n",
      "rnn_1.weight_ih_l0_reverse False\n",
      "rnn_1.weight_hh_l0_reverse False\n",
      "rnn_1.bias_ih_l0_reverse False\n",
      "rnn_1.bias_hh_l0_reverse False\n",
      "rnn_1.weight_ih_l1 False\n",
      "rnn_1.weight_hh_l1 False\n",
      "rnn_1.bias_ih_l1 False\n",
      "rnn_1.bias_hh_l1 False\n",
      "rnn_1.weight_ih_l1_reverse False\n",
      "rnn_1.weight_hh_l1_reverse False\n",
      "rnn_1.bias_ih_l1_reverse False\n",
      "rnn_1.bias_hh_l1_reverse False\n",
      "attn_step1.weight False\n",
      "attn_step1.bias False\n",
      "attn_step2.weight False\n",
      "attn_step2.bias False\n",
      "fc1.weight True\n",
      "fc1.bias True\n",
      "fc2.weight True\n",
      "fc2.bias True\n",
      "============================================================\n",
      "\n",
      "\n",
      "\n",
      "\u001b[92m2020-09-10 17:42:43\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m******************************\u001b[0m\n",
      "\u001b[92m2020-09-10 17:42:43\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m**** (Bi-directional) LSTM ****\u001b[0m\n",
      "\u001b[92m2020-09-10 17:42:43\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m******************************\u001b[0m\n",
      "\u001b[92m2020-09-10 17:42:43\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mNumber of batches: 63\u001b[0m\n",
      "\u001b[92m2020-09-10 17:42:43\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mNumber of epochs: 20\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "7ca31ac181d74e3bb56cdee2a203cba5",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=20.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=63.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "\n",
      "\n",
      "====================\n",
      "Total number of params: 721323\n",
      "\n",
      "two_parallel_rnns (\n",
      "  (emb): Embedding(7542, 60), weights=((7542, 60),), parameters=452520\n",
      "  (rnn_1): LSTM(60, 60, num_layers=2, dropout=0.01, bidirectional=True), weights=((240, 60), (240, 60), (240,), (240,), (240, 60), (240, 60), (240,), (240,), (240, 120), (240, 60), (240,), (240,), (240, 120), (240, 60), (240,), (240,)), parameters=145920\n",
      "  (attn_step1): Linear(in_features=120, out_features=60, bias=True), weights=((60, 120), (60,)), parameters=7260\n",
      "  (attn_step2): Linear(in_features=60, out_features=1, bias=True), weights=((1, 60), (1,)), parameters=61\n",
      "  (fc1): Linear(in_features=960, out_features=120, bias=True), weights=((120, 960), (120,)), parameters=115320\n",
      "  (fc2): Linear(in_features=120, out_features=2, bias=True), weights=((2, 120), (2,)), parameters=242\n",
      ")\n",
      "====================\n",
      "\n",
      "\n",
      "\u001b[92m2020-09-10 17:42:46\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_17:42:46 -- Epoch: 1/20; Train; loss: 1.086; acc: 0.581; precision: 0.577, recall: 0.607, macrof1: 0.580, weightedf1: 0.580\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:42:56\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_17:42:56 -- Epoch: 1/20; Valid; loss: 0.793; acc: 0.637; precision: 0.641, recall: 0.624, macrof1: 0.637, weightedf1: 0.637\u001b[0m\n",
      "\u001b[92m2020-09-10 17:42:56\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=63.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:42:58\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_17:42:58 -- Epoch: 2/20; Train; loss: 0.556; acc: 0.736; precision: 0.732, recall: 0.745, macrof1: 0.736, weightedf1: 0.736\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:43:08\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_17:43:08 -- Epoch: 2/20; Valid; loss: 0.577; acc: 0.728; precision: 0.735, recall: 0.715, macrof1: 0.728, weightedf1: 0.728\u001b[0m\n",
      "\u001b[92m2020-09-10 17:43:08\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=63.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:43:10\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_17:43:10 -- Epoch: 3/20; Train; loss: 0.412; acc: 0.819; precision: 0.814, recall: 0.828, macrof1: 0.819, weightedf1: 0.819\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:43:20\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_17:43:20 -- Epoch: 3/20; Valid; loss: 0.492; acc: 0.779; precision: 0.773, recall: 0.790, macrof1: 0.779, weightedf1: 0.779\u001b[0m\n",
      "\u001b[92m2020-09-10 17:43:20\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=63.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:43:22\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_17:43:22 -- Epoch: 4/20; Train; loss: 0.337; acc: 0.864; precision: 0.855, recall: 0.876, macrof1: 0.864, weightedf1: 0.864\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:43:31\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_17:43:31 -- Epoch: 4/20; Valid; loss: 0.455; acc: 0.801; precision: 0.791, recall: 0.819, macrof1: 0.801, weightedf1: 0.801\u001b[0m\n",
      "\u001b[92m2020-09-10 17:43:31\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=63.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:43:33\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_17:43:33 -- Epoch: 5/20; Train; loss: 0.282; acc: 0.895; precision: 0.889, recall: 0.902, macrof1: 0.895, weightedf1: 0.895\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:43:41\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_17:43:41 -- Epoch: 5/20; Valid; loss: 0.431; acc: 0.813; precision: 0.793, recall: 0.848, macrof1: 0.813, weightedf1: 0.813\u001b[0m\n",
      "\u001b[92m2020-09-10 17:43:41\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=63.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:43:43\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_17:43:43 -- Epoch: 6/20; Train; loss: 0.242; acc: 0.914; precision: 0.899, recall: 0.933, macrof1: 0.914, weightedf1: 0.914\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:43:52\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_17:43:52 -- Epoch: 6/20; Valid; loss: 0.424; acc: 0.824; precision: 0.837, recall: 0.804, macrof1: 0.824, weightedf1: 0.824\u001b[0m\n",
      "\u001b[92m2020-09-10 17:43:52\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=63.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:43:54\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_17:43:54 -- Epoch: 7/20; Train; loss: 0.205; acc: 0.932; precision: 0.925, recall: 0.941, macrof1: 0.932, weightedf1: 0.932\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:44:02\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_17:44:02 -- Epoch: 7/20; Valid; loss: 0.406; acc: 0.830; precision: 0.826, recall: 0.837, macrof1: 0.830, weightedf1: 0.830\u001b[0m\n",
      "\u001b[92m2020-09-10 17:44:02\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=63.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:44:04\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_17:44:04 -- Epoch: 8/20; Train; loss: 0.176; acc: 0.950; precision: 0.941, recall: 0.960, macrof1: 0.950, weightedf1: 0.950\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:44:13\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_17:44:13 -- Epoch: 8/20; Valid; loss: 0.407; acc: 0.834; precision: 0.836, recall: 0.831, macrof1: 0.834, weightedf1: 0.834\u001b[0m\n",
      "\u001b[92m2020-09-10 17:44:13\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model (early stopped) with least valid loss (checkpoint: 7) at ./models/wikigaz_en_ft_ocr_lstm_v001_n4000/wikigaz_en_ft_ocr_lstm_v001_n4000.model\u001b[0m\n",
      "\u001b[92m2020-09-10 17:44:13\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n",
      "\u001b[92m2020-09-10 17:44:13\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mEarly stopping at epoch: 8, selected epoch: 7\u001b[0m\n",
      "\n",
      "\n",
      "\n",
      "\n",
      "====================\n",
      "User time: 90.2244\n",
      "====================\n"
     ]
    }
   ],
   "source": [
    "from DeezyMatch import finetune as dm_finetune\n",
    "\n",
    "n_ft_examples = 4000\n",
    "\n",
    "# Fine-tune the pretrained model (weights at pretrained_model_path, vocabulary at\n",
    "# pretrained_vocab_path) using n_ft_examples training examples from the OCR dataset\n",
    "dm_finetune(input_file_path=\"./inputs/input_dfm_lstm_model_A.yaml\",\n",
    "            dataset_path=\"./dataset/ocr_trainval.txt\",\n",
    "            model_name=f\"wikigaz_en_ft_ocr_lstm_v001_n{n_ft_examples}\",\n",
    "            pretrained_model_path=\"./models/wikigaz_en_lstm_001/wikigaz_en_lstm_001.model\",\n",
    "            pretrained_vocab_path=\"./models/wikigaz_en_lstm_001/wikigaz_en_lstm_001.vocab\",\n",
    "            n_train_examples=n_ft_examples\n",
    "           )"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 30,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:44:13\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mread input file: ./inputs/input_dfm_lstm_model_A.yaml\u001b[0m\n",
      "\u001b[92m2020-09-10 17:44:13\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32mpytorch will use: cuda:1\u001b[0m\n",
      "\u001b[92m2020-09-10 17:44:13\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mread CSV file: ./dataset/ocr_trainval.txt\u001b[0m\n",
      "\u001b[92m2020-09-10 17:44:14\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32mnumber of labels, True: 42301 and False: 42302\u001b[0m\n",
      "\u001b[92m2020-09-10 17:44:14\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mSplitting the Dataset\u001b[0m\n",
      "\u001b[92m2020-09-10 17:44:14\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mfinish splitting the Dataset. User time: 0.049421072006225586\u001b[0m\n",
      "\u001b[92m2020-09-10 17:44:14\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msplits are as follow:\n",
      "not_assigned    51221\n",
      "val             25380\n",
      "train            8000\n",
      "test                2\n",
      "Name: split, dtype: int64\u001b[0m\n",
      "\u001b[92m2020-09-10 17:44:14\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mstart creating a lookup table and convert characters to indices\u001b[0m\n",
      "\u001b[92m2020-09-10 17:44:14\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32m-- create vocabulary\u001b[0m\n",
      "\u001b[92m2020-09-10 17:44:15\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32m-- convert tokens to indices\u001b[0m\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "length s1:   0%|          | 0/25380 [00:00<?, ?it/s]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:44:15\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mskipping 0 lines\u001b[0m\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "                                                                     \r"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "\n",
      "============================================================\n",
      "List all parameters in the model\n",
      "============================================================\n",
      "emb.weight False\n",
      "rnn_1.weight_ih_l0 False\n",
      "rnn_1.weight_hh_l0 False\n",
      "rnn_1.bias_ih_l0 False\n",
      "rnn_1.bias_hh_l0 False\n",
      "rnn_1.weight_ih_l0_reverse False\n",
      "rnn_1.weight_hh_l0_reverse False\n",
      "rnn_1.bias_ih_l0_reverse False\n",
      "rnn_1.bias_hh_l0_reverse False\n",
      "rnn_1.weight_ih_l1 False\n",
      "rnn_1.weight_hh_l1 False\n",
      "rnn_1.bias_ih_l1 False\n",
      "rnn_1.bias_hh_l1 False\n",
      "rnn_1.weight_ih_l1_reverse False\n",
      "rnn_1.weight_hh_l1_reverse False\n",
      "rnn_1.bias_ih_l1_reverse False\n",
      "rnn_1.bias_hh_l1_reverse False\n",
      "attn_step1.weight False\n",
      "attn_step1.bias False\n",
      "attn_step2.weight False\n",
      "attn_step2.bias False\n",
      "fc1.weight True\n",
      "fc1.bias True\n",
      "fc2.weight True\n",
      "fc2.bias True\n",
      "============================================================\n",
      "\n",
      "\n",
      "\n",
      "\u001b[92m2020-09-10 17:44:16\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m******************************\u001b[0m\n",
      "\u001b[92m2020-09-10 17:44:16\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m**** (Bi-directional) LSTM ****\u001b[0m\n",
      "\u001b[92m2020-09-10 17:44:16\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m******************************\u001b[0m\n",
      "\u001b[92m2020-09-10 17:44:16\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mNumber of batches: 125\u001b[0m\n",
      "\u001b[92m2020-09-10 17:44:16\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mNumber of epochs: 20\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "9435ec3c0eb142dab9ebcd50d8456a86",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=20.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=125.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "\n",
      "\n",
      "====================\n",
      "Total number of params: 721323\n",
      "\n",
      "two_parallel_rnns (\n",
      "  (emb): Embedding(7542, 60), weights=((7542, 60),), parameters=452520\n",
      "  (rnn_1): LSTM(60, 60, num_layers=2, dropout=0.01, bidirectional=True), weights=((240, 60), (240, 60), (240,), (240,), (240, 60), (240, 60), (240,), (240,), (240, 120), (240, 60), (240,), (240,), (240, 120), (240, 60), (240,), (240,)), parameters=145920\n",
      "  (attn_step1): Linear(in_features=120, out_features=60, bias=True), weights=((60, 120), (60,)), parameters=7260\n",
      "  (attn_step2): Linear(in_features=60, out_features=1, bias=True), weights=((1, 60), (1,)), parameters=61\n",
      "  (fc1): Linear(in_features=960, out_features=120, bias=True), weights=((120, 960), (120,)), parameters=115320\n",
      "  (fc2): Linear(in_features=120, out_features=2, bias=True), weights=((2, 120), (2,)), parameters=242\n",
      ")\n",
      "====================\n",
      "\n",
      "\n",
      "\u001b[92m2020-09-10 17:44:20\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_17:44:20 -- Epoch: 1/20; Train; loss: 0.865; acc: 0.637; precision: 0.631, recall: 0.659, macrof1: 0.637, weightedf1: 0.637\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:44:28\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_17:44:28 -- Epoch: 1/20; Valid; loss: 0.557; acc: 0.737; precision: 0.754, recall: 0.704, macrof1: 0.737, weightedf1: 0.737\u001b[0m\n",
      "\u001b[92m2020-09-10 17:44:28\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=125.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:44:32\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_17:44:32 -- Epoch: 2/20; Train; loss: 0.437; acc: 0.805; precision: 0.801, recall: 0.811, macrof1: 0.805, weightedf1: 0.805\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:44:41\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_17:44:41 -- Epoch: 2/20; Valid; loss: 0.426; acc: 0.813; precision: 0.804, recall: 0.827, macrof1: 0.813, weightedf1: 0.813\u001b[0m\n",
      "\u001b[92m2020-09-10 17:44:41\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=125.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:44:44\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_17:44:44 -- Epoch: 3/20; Train; loss: 0.344; acc: 0.859; precision: 0.851, recall: 0.871, macrof1: 0.859, weightedf1: 0.859\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:44:53\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_17:44:53 -- Epoch: 3/20; Valid; loss: 0.378; acc: 0.841; precision: 0.831, recall: 0.856, macrof1: 0.841, weightedf1: 0.841\u001b[0m\n",
      "\u001b[92m2020-09-10 17:44:53\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=125.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:44:57\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_17:44:57 -- Epoch: 4/20; Train; loss: 0.288; acc: 0.890; precision: 0.880, recall: 0.902, macrof1: 0.890, weightedf1: 0.890\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:45:05\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_17:45:05 -- Epoch: 4/20; Valid; loss: 0.351; acc: 0.854; precision: 0.848, recall: 0.863, macrof1: 0.854, weightedf1: 0.854\u001b[0m\n",
      "\u001b[92m2020-09-10 17:45:05\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=125.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:45:09\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_17:45:09 -- Epoch: 5/20; Train; loss: 0.242; acc: 0.911; precision: 0.899, recall: 0.925, macrof1: 0.911, weightedf1: 0.911\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:45:18\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_17:45:18 -- Epoch: 5/20; Valid; loss: 0.340; acc: 0.862; precision: 0.859, recall: 0.865, macrof1: 0.862, weightedf1: 0.862\u001b[0m\n",
      "\u001b[92m2020-09-10 17:45:18\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=125.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:45:21\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_17:45:21 -- Epoch: 6/20; Train; loss: 0.206; acc: 0.928; precision: 0.920, recall: 0.939, macrof1: 0.928, weightedf1: 0.928\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:45:30\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_17:45:30 -- Epoch: 6/20; Valid; loss: 0.345; acc: 0.862; precision: 0.885, recall: 0.832, macrof1: 0.862, weightedf1: 0.862\u001b[0m\n",
      "\u001b[92m2020-09-10 17:45:30\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model (early stopped) with least valid loss (checkpoint: 5) at ./models/wikigaz_en_ft_ocr_lstm_v001_n8000/wikigaz_en_ft_ocr_lstm_v001_n8000.model\u001b[0m\n",
      "\u001b[92m2020-09-10 17:45:30\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n",
      "\u001b[92m2020-09-10 17:45:30\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mEarly stopping at epoch: 6, selected epoch: 5\u001b[0m\n",
      "\n",
      "\n",
      "\n",
      "\n",
      "====================\n",
      "User time: 73.9374\n",
      "====================\n"
     ]
    }
   ],
   "source": [
    "from DeezyMatch import finetune as dm_finetune\n",
    "\n",
    "# number of OCR training pairs used for fine-tuning\n",
    "n_ft_examples = 8000\n",
    "\n",
    "# Fine-tune the pretrained WG:en model (weights and vocabulary are read from\n",
    "# pretrained_model_path and pretrained_vocab_path) on the OCR dataset.\n",
    "# Model A setup: embedding and recurrent layers stay frozen during fine-tuning.\n",
    "dm_finetune(input_file_path=\"./inputs/input_dfm_lstm_model_A.yaml\",\n",
    "            dataset_path=\"./dataset/ocr_trainval.txt\",\n",
    "            model_name=f\"wikigaz_en_ft_ocr_lstm_v001_n{n_ft_examples}\",\n",
    "            pretrained_model_path=\"./models/wikigaz_en_lstm_001/wikigaz_en_lstm_001.model\",\n",
    "            pretrained_vocab_path=\"./models/wikigaz_en_lstm_001/wikigaz_en_lstm_001.vocab\",\n",
    "            n_train_examples=n_ft_examples\n",
    "           )"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 31,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:45:30\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mread input file: ./inputs/input_dfm_lstm_model_A.yaml\u001b[0m\n",
      "\u001b[92m2020-09-10 17:45:30\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32mpytorch will use: cuda:1\u001b[0m\n",
      "\u001b[92m2020-09-10 17:45:30\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mread CSV file: ./dataset/ocr_trainval.txt\u001b[0m\n",
      "\u001b[92m2020-09-10 17:45:30\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32mnumber of labels, True: 42301 and False: 42302\u001b[0m\n",
      "\u001b[92m2020-09-10 17:45:30\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mSplitting the Dataset\u001b[0m\n",
      "\u001b[92m2020-09-10 17:45:30\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mfinish splitting the Dataset. User time: 0.0449368953704834\u001b[0m\n",
      "\u001b[92m2020-09-10 17:45:31\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msplits are as follow:\n",
      "not_assigned    43221\n",
      "val             25380\n",
      "train           16000\n",
      "test                2\n",
      "Name: split, dtype: int64\u001b[0m\n",
      "\u001b[92m2020-09-10 17:45:31\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mstart creating a lookup table and convert characters to indices\u001b[0m\n",
      "\u001b[92m2020-09-10 17:45:31\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32m-- create vocabulary\u001b[0m\n",
      "\u001b[92m2020-09-10 17:45:32\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32m-- convert tokens to indices\u001b[0m\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "s1 padding:   0%|          | 0/16000 [00:00<?, ?it/s]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:45:32\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mskipping 0 lines\u001b[0m\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "                                                                     \r"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "\n",
      "============================================================\n",
      "List all parameters in the model\n",
      "============================================================\n",
      "emb.weight False\n",
      "rnn_1.weight_ih_l0 False\n",
      "rnn_1.weight_hh_l0 False\n",
      "rnn_1.bias_ih_l0 False\n",
      "rnn_1.bias_hh_l0 False\n",
      "rnn_1.weight_ih_l0_reverse False\n",
      "rnn_1.weight_hh_l0_reverse False\n",
      "rnn_1.bias_ih_l0_reverse False\n",
      "rnn_1.bias_hh_l0_reverse False\n",
      "rnn_1.weight_ih_l1 False\n",
      "rnn_1.weight_hh_l1 False\n",
      "rnn_1.bias_ih_l1 False\n",
      "rnn_1.bias_hh_l1 False\n",
      "rnn_1.weight_ih_l1_reverse False\n",
      "rnn_1.weight_hh_l1_reverse False\n",
      "rnn_1.bias_ih_l1_reverse False\n",
      "rnn_1.bias_hh_l1_reverse False\n",
      "attn_step1.weight False\n",
      "attn_step1.bias False\n",
      "attn_step2.weight False\n",
      "attn_step2.bias False\n",
      "fc1.weight True\n",
      "fc1.bias True\n",
      "fc2.weight True\n",
      "fc2.bias True\n",
      "============================================================\n",
      "\n",
      "\n",
      "\n",
      "\u001b[92m2020-09-10 17:45:33\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m******************************\u001b[0m\n",
      "\u001b[92m2020-09-10 17:45:33\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m**** (Bi-directional) LSTM ****\u001b[0m\n",
      "\u001b[92m2020-09-10 17:45:33\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m******************************\u001b[0m\n",
      "\u001b[92m2020-09-10 17:45:33\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mNumber of batches: 250\u001b[0m\n",
      "\u001b[92m2020-09-10 17:45:33\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mNumber of epochs: 20\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "9f3710fb9ea741a9898554d33b7caa4c",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=20.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=250.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "\n",
      "\n",
      "====================\n",
      "Total number of params: 721323\n",
      "\n",
      "two_parallel_rnns (\n",
      "  (emb): Embedding(7542, 60), weights=((7542, 60),), parameters=452520\n",
      "  (rnn_1): LSTM(60, 60, num_layers=2, dropout=0.01, bidirectional=True), weights=((240, 60), (240, 60), (240,), (240,), (240, 60), (240, 60), (240,), (240,), (240, 120), (240, 60), (240,), (240,), (240, 120), (240, 60), (240,), (240,)), parameters=145920\n",
      "  (attn_step1): Linear(in_features=120, out_features=60, bias=True), weights=((60, 120), (60,)), parameters=7260\n",
      "  (attn_step2): Linear(in_features=60, out_features=1, bias=True), weights=((1, 60), (1,)), parameters=61\n",
      "  (fc1): Linear(in_features=960, out_features=120, bias=True), weights=((120, 960), (120,)), parameters=115320\n",
      "  (fc2): Linear(in_features=120, out_features=2, bias=True), weights=((2, 120), (2,)), parameters=242\n",
      ")\n",
      "====================\n",
      "\n",
      "\n",
      "\u001b[92m2020-09-10 17:45:40\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_17:45:40 -- Epoch: 1/20; Train; loss: 0.679; acc: 0.710; precision: 0.705, recall: 0.721, macrof1: 0.710, weightedf1: 0.710\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:45:49\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_17:45:49 -- Epoch: 1/20; Valid; loss: 0.419; acc: 0.817; precision: 0.816, recall: 0.819, macrof1: 0.817, weightedf1: 0.817\u001b[0m\n",
      "\u001b[92m2020-09-10 17:45:49\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=250.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:45:56\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_17:45:56 -- Epoch: 2/20; Train; loss: 0.349; acc: 0.855; precision: 0.846, recall: 0.867, macrof1: 0.855, weightedf1: 0.855\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:46:05\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_17:46:05 -- Epoch: 2/20; Valid; loss: 0.343; acc: 0.857; precision: 0.842, recall: 0.879, macrof1: 0.857, weightedf1: 0.857\u001b[0m\n",
      "\u001b[92m2020-09-10 17:46:05\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=250.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:46:12\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_17:46:12 -- Epoch: 3/20; Train; loss: 0.280; acc: 0.892; precision: 0.882, recall: 0.907, macrof1: 0.892, weightedf1: 0.892\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:46:21\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_17:46:21 -- Epoch: 3/20; Valid; loss: 0.321; acc: 0.867; precision: 0.886, recall: 0.842, macrof1: 0.866, weightedf1: 0.866\u001b[0m\n",
      "\u001b[92m2020-09-10 17:46:21\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=250.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:46:28\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_17:46:28 -- Epoch: 4/20; Train; loss: 0.234; acc: 0.914; precision: 0.905, recall: 0.924, macrof1: 0.914, weightedf1: 0.914\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:46:37\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_17:46:37 -- Epoch: 4/20; Valid; loss: 0.300; acc: 0.877; precision: 0.858, recall: 0.905, macrof1: 0.877, weightedf1: 0.877\u001b[0m\n",
      "\u001b[92m2020-09-10 17:46:37\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=250.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:46:45\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_17:46:45 -- Epoch: 5/20; Train; loss: 0.197; acc: 0.929; precision: 0.920, recall: 0.939, macrof1: 0.929, weightedf1: 0.929\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:46:53\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_17:46:53 -- Epoch: 5/20; Valid; loss: 0.290; acc: 0.883; precision: 0.888, recall: 0.877, macrof1: 0.883, weightedf1: 0.883\u001b[0m\n",
      "\u001b[92m2020-09-10 17:46:53\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=250.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:47:01\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_17:47:01 -- Epoch: 6/20; Train; loss: 0.166; acc: 0.943; precision: 0.935, recall: 0.952, macrof1: 0.943, weightedf1: 0.943\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:47:09\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_17:47:09 -- Epoch: 6/20; Valid; loss: 0.285; acc: 0.885; precision: 0.879, recall: 0.894, macrof1: 0.885, weightedf1: 0.885\u001b[0m\n",
      "\u001b[92m2020-09-10 17:47:09\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=250.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:47:17\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_17:47:17 -- Epoch: 7/20; Train; loss: 0.137; acc: 0.955; precision: 0.947, recall: 0.963, macrof1: 0.955, weightedf1: 0.955\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:47:25\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_17:47:25 -- Epoch: 7/20; Valid; loss: 0.291; acc: 0.888; precision: 0.865, recall: 0.920, macrof1: 0.888, weightedf1: 0.888\u001b[0m\n",
      "\u001b[92m2020-09-10 17:47:25\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model (early stopped) with least valid loss (checkpoint: 6) at ./models/wikigaz_en_ft_ocr_lstm_v001_n16000/wikigaz_en_ft_ocr_lstm_v001_n16000.model\u001b[0m\n",
      "\u001b[92m2020-09-10 17:47:25\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n",
      "\u001b[92m2020-09-10 17:47:25\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mEarly stopping at epoch: 7, selected epoch: 6\u001b[0m\n",
      "\n",
      "\n",
      "\n",
      "\n",
      "====================\n",
      "User time: 112.5512\n",
      "====================\n"
     ]
    }
   ],
   "source": [
    "from DeezyMatch import finetune as dm_finetune\n",
    "\n",
    "n_ft_examples = 16000\n",
    "\n",
    "# Fine-tune the pretrained WG:en LSTM model (pretrained_model_path, pretrained_vocab_path)\n",
    "# on n_ft_examples training pairs from the OCR dataset, using the model A input file.\n",
    "dm_finetune(input_file_path=\"./inputs/input_dfm_lstm_model_A.yaml\",\n",
    "            dataset_path=\"./dataset/ocr_trainval.txt\",\n",
    "            model_name=f\"wikigaz_en_ft_ocr_lstm_v001_n{n_ft_examples}\",\n",
    "            pretrained_model_path=\"./models/wikigaz_en_lstm_001/wikigaz_en_lstm_001.model\",\n",
    "            pretrained_vocab_path=\"./models/wikigaz_en_lstm_001/wikigaz_en_lstm_001.vocab\",\n",
    "            n_train_examples=n_ft_examples\n",
    "           )"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 32,
   "metadata": {
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:47:25\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mread input file: ./inputs/input_dfm_lstm_model_A.yaml\u001b[0m\n",
      "\u001b[92m2020-09-10 17:47:25\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32mpytorch will use: cuda:1\u001b[0m\n",
      "\u001b[92m2020-09-10 17:47:25\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mread CSV file: ./dataset/ocr_trainval.txt\u001b[0m\n",
      "\u001b[92m2020-09-10 17:47:26\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32mnumber of labels, True: 42301 and False: 42302\u001b[0m\n",
      "\u001b[92m2020-09-10 17:47:26\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mSplitting the Dataset\u001b[0m\n",
      "\u001b[92m2020-09-10 17:47:26\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mfinish splitting the Dataset. User time: 0.04964113235473633\u001b[0m\n",
      "\u001b[92m2020-09-10 17:47:26\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msplits are as follow:\n",
      "train           32000\n",
      "not_assigned    27221\n",
      "val             25380\n",
      "test                2\n",
      "Name: split, dtype: int64\u001b[0m\n",
      "\u001b[92m2020-09-10 17:47:26\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mstart creating a lookup table and convert characters to indices\u001b[0m\n",
      "\u001b[92m2020-09-10 17:47:26\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32m-- create vocabulary\u001b[0m\n",
      "\u001b[92m2020-09-10 17:47:27\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32m-- convert tokens to indices\u001b[0m\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "s1 padding:   0%|          | 0/32000 [00:00<?, ?it/s]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:47:28\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mskipping 0 lines\u001b[0m\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "                                                                     \r"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "\n",
      "============================================================\n",
      "List all parameters in the model\n",
      "============================================================\n",
      "emb.weight False\n",
      "rnn_1.weight_ih_l0 False\n",
      "rnn_1.weight_hh_l0 False\n",
      "rnn_1.bias_ih_l0 False\n",
      "rnn_1.bias_hh_l0 False\n",
      "rnn_1.weight_ih_l0_reverse False\n",
      "rnn_1.weight_hh_l0_reverse False\n",
      "rnn_1.bias_ih_l0_reverse False\n",
      "rnn_1.bias_hh_l0_reverse False\n",
      "rnn_1.weight_ih_l1 False\n",
      "rnn_1.weight_hh_l1 False\n",
      "rnn_1.bias_ih_l1 False\n",
      "rnn_1.bias_hh_l1 False\n",
      "rnn_1.weight_ih_l1_reverse False\n",
      "rnn_1.weight_hh_l1_reverse False\n",
      "rnn_1.bias_ih_l1_reverse False\n",
      "rnn_1.bias_hh_l1_reverse False\n",
      "attn_step1.weight False\n",
      "attn_step1.bias False\n",
      "attn_step2.weight False\n",
      "attn_step2.bias False\n",
      "fc1.weight True\n",
      "fc1.bias True\n",
      "fc2.weight True\n",
      "fc2.bias True\n",
      "============================================================\n",
      "\n",
      "\n",
      "\n",
      "\u001b[92m2020-09-10 17:47:29\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m******************************\u001b[0m\n",
      "\u001b[92m2020-09-10 17:47:29\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m**** (Bi-directional) LSTM ****\u001b[0m\n",
      "\u001b[92m2020-09-10 17:47:29\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m******************************\u001b[0m\n",
      "\u001b[92m2020-09-10 17:47:29\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mNumber of batches: 500\u001b[0m\n",
      "\u001b[92m2020-09-10 17:47:29\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mNumber of epochs: 20\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "19e3df35a9cf4496b73d60b00fba3472",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=20.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=500.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "\n",
      "\n",
      "====================\n",
      "Total number of params: 721323\n",
      "\n",
      "two_parallel_rnns (\n",
      "  (emb): Embedding(7542, 60), weights=((7542, 60),), parameters=452520\n",
      "  (rnn_1): LSTM(60, 60, num_layers=2, dropout=0.01, bidirectional=True), weights=((240, 60), (240, 60), (240,), (240,), (240, 60), (240, 60), (240,), (240,), (240, 120), (240, 60), (240,), (240,), (240, 120), (240, 60), (240,), (240,)), parameters=145920\n",
      "  (attn_step1): Linear(in_features=120, out_features=60, bias=True), weights=((60, 120), (60,)), parameters=7260\n",
      "  (attn_step2): Linear(in_features=60, out_features=1, bias=True), weights=((1, 60), (1,)), parameters=61\n",
      "  (fc1): Linear(in_features=960, out_features=120, bias=True), weights=((120, 960), (120,)), parameters=115320\n",
      "  (fc2): Linear(in_features=120, out_features=2, bias=True), weights=((2, 120), (2,)), parameters=242\n",
      ")\n",
      "====================\n",
      "\n",
      "\n",
      "\u001b[92m2020-09-10 17:47:44\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_17:47:44 -- Epoch: 1/20; Train; loss: 0.522; acc: 0.779; precision: 0.772, recall: 0.792, macrof1: 0.779, weightedf1: 0.779\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:47:52\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_17:47:52 -- Epoch: 1/20; Valid; loss: 0.338; acc: 0.860; precision: 0.854, recall: 0.869, macrof1: 0.860, weightedf1: 0.860\u001b[0m\n",
      "\u001b[92m2020-09-10 17:47:52\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=500.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:48:08\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_17:48:08 -- Epoch: 2/20; Train; loss: 0.286; acc: 0.885; precision: 0.876, recall: 0.898, macrof1: 0.885, weightedf1: 0.885\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:48:17\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_17:48:17 -- Epoch: 2/20; Valid; loss: 0.293; acc: 0.878; precision: 0.851, recall: 0.917, macrof1: 0.878, weightedf1: 0.878\u001b[0m\n",
      "\u001b[92m2020-09-10 17:48:17\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=500.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:48:32\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_17:48:32 -- Epoch: 3/20; Train; loss: 0.231; acc: 0.909; precision: 0.900, recall: 0.921, macrof1: 0.909, weightedf1: 0.909\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:48:41\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_17:48:41 -- Epoch: 3/20; Valid; loss: 0.271; acc: 0.890; precision: 0.863, recall: 0.926, macrof1: 0.889, weightedf1: 0.889\u001b[0m\n",
      "\u001b[92m2020-09-10 17:48:41\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=500.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:48:56\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_17:48:56 -- Epoch: 4/20; Train; loss: 0.190; acc: 0.926; precision: 0.918, recall: 0.935, macrof1: 0.926, weightedf1: 0.926\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:49:04\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_17:49:04 -- Epoch: 4/20; Valid; loss: 0.258; acc: 0.899; precision: 0.881, recall: 0.922, macrof1: 0.899, weightedf1: 0.899\u001b[0m\n",
      "\u001b[92m2020-09-10 17:49:04\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=500.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:49:19\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_17:49:19 -- Epoch: 5/20; Train; loss: 0.157; acc: 0.942; precision: 0.933, recall: 0.951, macrof1: 0.942, weightedf1: 0.942\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:49:28\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_17:49:28 -- Epoch: 5/20; Valid; loss: 0.250; acc: 0.904; precision: 0.899, recall: 0.912, macrof1: 0.904, weightedf1: 0.904\u001b[0m\n",
      "\u001b[92m2020-09-10 17:49:28\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=500.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:49:43\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_17:49:43 -- Epoch: 6/20; Train; loss: 0.128; acc: 0.954; precision: 0.947, recall: 0.963, macrof1: 0.954, weightedf1: 0.954\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:49:51\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_17:49:51 -- Epoch: 6/20; Valid; loss: 0.252; acc: 0.906; precision: 0.893, recall: 0.923, macrof1: 0.906, weightedf1: 0.906\u001b[0m\n",
      "\u001b[92m2020-09-10 17:49:51\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model (early stopped) with least valid loss (checkpoint: 5) at ./models/wikigaz_en_ft_ocr_lstm_v001_n32000/wikigaz_en_ft_ocr_lstm_v001_n32000.model\u001b[0m\n",
      "\u001b[92m2020-09-10 17:49:51\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n",
      "\u001b[92m2020-09-10 17:49:51\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mEarly stopping at epoch: 6, selected epoch: 5\u001b[0m\n",
      "\n",
      "\n",
      "\n",
      "\n",
      "====================\n",
      "User time: 142.6903\n",
      "====================\n"
     ]
    }
   ],
   "source": [
    "from DeezyMatch import finetune as dm_finetune\n",
    "\n",
    "n_ft_examples = 32000\n",
    "\n",
    "# Fine-tune the pretrained WG:en LSTM model (pretrained_model_path, pretrained_vocab_path)\n",
    "# on n_ft_examples training pairs from the OCR dataset, using the model A input file.\n",
    "dm_finetune(input_file_path=\"./inputs/input_dfm_lstm_model_A.yaml\",\n",
    "            dataset_path=\"./dataset/ocr_trainval.txt\",\n",
    "            model_name=f\"wikigaz_en_ft_ocr_lstm_v001_n{n_ft_examples}\",\n",
    "            pretrained_model_path=\"./models/wikigaz_en_lstm_001/wikigaz_en_lstm_001.model\",\n",
    "            pretrained_vocab_path=\"./models/wikigaz_en_lstm_001/wikigaz_en_lstm_001.vocab\",\n",
    "            n_train_examples=n_ft_examples\n",
    "           )"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 33,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:49:51\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mread input file: ./inputs/input_dfm_lstm_model_A_no_early_stopping.yaml\u001b[0m\n",
      "\u001b[92m2020-09-10 17:49:51\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32mpytorch will use: cuda:1\u001b[0m\n",
      "\u001b[92m2020-09-10 17:49:51\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mread CSV file: ./dataset/ocr_trainval.txt\u001b[0m\n",
      "\u001b[92m2020-09-10 17:49:52\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32mnumber of labels, True: 42301 and False: 42302\u001b[0m\n",
      "\u001b[92m2020-09-10 17:49:52\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mSplitting the Dataset\u001b[0m\n",
      "\u001b[92m2020-09-10 17:49:52\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mfinish splitting the Dataset. User time: 0.0503392219543457\u001b[0m\n",
      "\u001b[92m2020-09-10 17:49:52\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msplits are as follow:\n",
      "train    64000\n",
      "val      20603\n",
      "Name: split, dtype: int64\u001b[0m\n",
      "\u001b[92m2020-09-10 17:49:52\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mstart creating a lookup table and convert characters to indices\u001b[0m\n",
      "\u001b[92m2020-09-10 17:49:52\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32m-- create vocabulary\u001b[0m\n",
      "\u001b[92m2020-09-10 17:49:53\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32m-- convert tokens to indices\u001b[0m\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "length s2:   0%|          | 0/64000 [00:00<?, ?it/s]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:49:54\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mskipping 0 lines\u001b[0m\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "                                                                     \r"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "\n",
      "============================================================\n",
      "List all parameters in the model\n",
      "============================================================\n",
      "emb.weight False\n",
      "rnn_1.weight_ih_l0 False\n",
      "rnn_1.weight_hh_l0 False\n",
      "rnn_1.bias_ih_l0 False\n",
      "rnn_1.bias_hh_l0 False\n",
      "rnn_1.weight_ih_l0_reverse False\n",
      "rnn_1.weight_hh_l0_reverse False\n",
      "rnn_1.bias_ih_l0_reverse False\n",
      "rnn_1.bias_hh_l0_reverse False\n",
      "rnn_1.weight_ih_l1 False\n",
      "rnn_1.weight_hh_l1 False\n",
      "rnn_1.bias_ih_l1 False\n",
      "rnn_1.bias_hh_l1 False\n",
      "rnn_1.weight_ih_l1_reverse False\n",
      "rnn_1.weight_hh_l1_reverse False\n",
      "rnn_1.bias_ih_l1_reverse False\n",
      "rnn_1.bias_hh_l1_reverse False\n",
      "attn_step1.weight False\n",
      "attn_step1.bias False\n",
      "attn_step2.weight False\n",
      "attn_step2.bias False\n",
      "fc1.weight True\n",
      "fc1.bias True\n",
      "fc2.weight True\n",
      "fc2.bias True\n",
      "============================================================\n",
      "\n",
      "\n",
      "\n",
      "\u001b[92m2020-09-10 17:49:55\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m******************************\u001b[0m\n",
      "\u001b[92m2020-09-10 17:49:55\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m**** (Bi-directional) LSTM ****\u001b[0m\n",
      "\u001b[92m2020-09-10 17:49:55\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m******************************\u001b[0m\n",
      "\u001b[92m2020-09-10 17:49:55\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mNumber of batches: 1000\u001b[0m\n",
      "\u001b[92m2020-09-10 17:49:55\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mNumber of epochs: 10\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "76d1e04741c6488ea19eb02e26d2170b",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=10.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=1000.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "\n",
      "\n",
      "====================\n",
      "Total number of params: 721323\n",
      "\n",
      "two_parallel_rnns (\n",
      "  (emb): Embedding(7542, 60), weights=((7542, 60),), parameters=452520\n",
      "  (rnn_1): LSTM(60, 60, num_layers=2, dropout=0.01, bidirectional=True), weights=((240, 60), (240, 60), (240,), (240,), (240, 60), (240, 60), (240,), (240,), (240, 120), (240, 60), (240,), (240,), (240, 120), (240, 60), (240,), (240,)), parameters=145920\n",
      "  (attn_step1): Linear(in_features=120, out_features=60, bias=True), weights=((60, 120), (60,)), parameters=7260\n",
      "  (attn_step2): Linear(in_features=60, out_features=1, bias=True), weights=((1, 60), (1,)), parameters=61\n",
      "  (fc1): Linear(in_features=960, out_features=120, bias=True), weights=((120, 960), (120,)), parameters=115320\n",
      "  (fc2): Linear(in_features=120, out_features=2, bias=True), weights=((2, 120), (2,)), parameters=242\n",
      ")\n",
      "====================\n",
      "\n",
      "\n",
      "\u001b[92m2020-09-10 17:50:25\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_17:50:25 -- Epoch: 1/10; Train; loss: 0.419; acc: 0.823; precision: 0.813, recall: 0.840, macrof1: 0.823, weightedf1: 0.823\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=322.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:50:32\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_17:50:32 -- Epoch: 1/10; Valid; loss: 0.279; acc: 0.888; precision: 0.882, recall: 0.895, macrof1: 0.888, weightedf1: 0.888\u001b[0m\n",
      "\u001b[92m2020-09-10 17:50:32\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=1000.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:51:02\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_17:51:02 -- Epoch: 2/10; Train; loss: 0.244; acc: 0.904; precision: 0.894, recall: 0.916, macrof1: 0.904, weightedf1: 0.904\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=322.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:51:08\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_17:51:08 -- Epoch: 2/10; Valid; loss: 0.232; acc: 0.908; precision: 0.902, recall: 0.914, macrof1: 0.908, weightedf1: 0.908\u001b[0m\n",
      "\u001b[92m2020-09-10 17:51:08\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=1000.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:51:39\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_17:51:39 -- Epoch: 3/10; Train; loss: 0.195; acc: 0.925; precision: 0.916, recall: 0.937, macrof1: 0.925, weightedf1: 0.925\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=322.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:51:46\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_17:51:46 -- Epoch: 3/10; Valid; loss: 0.216; acc: 0.914; precision: 0.892, recall: 0.942, macrof1: 0.914, weightedf1: 0.914\u001b[0m\n",
      "\u001b[92m2020-09-10 17:51:46\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=1000.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:52:16\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_17:52:16 -- Epoch: 4/10; Train; loss: 0.161; acc: 0.940; precision: 0.931, recall: 0.949, macrof1: 0.940, weightedf1: 0.940\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=322.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:52:23\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_17:52:23 -- Epoch: 4/10; Valid; loss: 0.204; acc: 0.921; precision: 0.905, recall: 0.941, macrof1: 0.921, weightedf1: 0.921\u001b[0m\n",
      "\u001b[92m2020-09-10 17:52:23\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=1000.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:52:53\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_17:52:53 -- Epoch: 5/10; Train; loss: 0.133; acc: 0.951; precision: 0.942, recall: 0.960, macrof1: 0.951, weightedf1: 0.951\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=322.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:53:00\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_17:53:00 -- Epoch: 5/10; Valid; loss: 0.203; acc: 0.923; precision: 0.913, recall: 0.935, macrof1: 0.923, weightedf1: 0.923\u001b[0m\n",
      "\u001b[92m2020-09-10 17:53:00\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=1000.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:53:30\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_17:53:30 -- Epoch: 6/10; Train; loss: 0.111; acc: 0.959; precision: 0.952, recall: 0.967, macrof1: 0.959, weightedf1: 0.959\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=322.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:53:37\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_17:53:37 -- Epoch: 6/10; Valid; loss: 0.208; acc: 0.925; precision: 0.922, recall: 0.927, macrof1: 0.925, weightedf1: 0.925\u001b[0m\n",
      "\u001b[92m2020-09-10 17:53:37\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=1000.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:54:07\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_17:54:07 -- Epoch: 7/10; Train; loss: 0.093; acc: 0.967; precision: 0.961, recall: 0.973, macrof1: 0.967, weightedf1: 0.967\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=322.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:54:14\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_17:54:14 -- Epoch: 7/10; Valid; loss: 0.209; acc: 0.926; precision: 0.923, recall: 0.928, macrof1: 0.926, weightedf1: 0.926\u001b[0m\n",
      "\u001b[92m2020-09-10 17:54:14\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=1000.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:54:44\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_17:54:44 -- Epoch: 8/10; Train; loss: 0.077; acc: 0.974; precision: 0.968, recall: 0.979, macrof1: 0.974, weightedf1: 0.974\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=322.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:54:51\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_17:54:51 -- Epoch: 8/10; Valid; loss: 0.218; acc: 0.925; precision: 0.914, recall: 0.939, macrof1: 0.925, weightedf1: 0.925\u001b[0m\n",
      "\u001b[92m2020-09-10 17:54:51\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=1000.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:55:21\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_17:55:21 -- Epoch: 9/10; Train; loss: 0.063; acc: 0.979; precision: 0.974, recall: 0.984, macrof1: 0.979, weightedf1: 0.979\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=322.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:55:28\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_17:55:28 -- Epoch: 9/10; Valid; loss: 0.236; acc: 0.924; precision: 0.927, recall: 0.921, macrof1: 0.924, weightedf1: 0.924\u001b[0m\n",
      "\u001b[92m2020-09-10 17:55:28\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=1000.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:55:58\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_17:55:58 -- Epoch: 10/10; Train; loss: 0.052; acc: 0.983; precision: 0.980, recall: 0.986, macrof1: 0.983, weightedf1: 0.983\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=322.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:56:05\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_17:56:05 -- Epoch: 10/10; Valid; loss: 0.250; acc: 0.923; precision: 0.918, recall: 0.929, macrof1: 0.923, weightedf1: 0.923\u001b[0m\n",
      "\u001b[92m2020-09-10 17:56:05\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n",
      "\n",
      "\u001b[92m2020-09-10 17:56:05\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model with least valid loss (checkpoint: 5) at ./models/wikigaz_en_ft_ocr_lstm_v001_n64000/wikigaz_en_ft_ocr_lstm_v001_n64000.model\u001b[0m\n",
      "\n",
      "\n",
      "\n",
      "====================\n",
      "User time: 369.7571\n",
      "====================\n"
     ]
    }
   ],
   "source": [
    "from DeezyMatch import finetune as dm_finetune\n",
    "\n",
    "n_ft_examples = 64000\n",
    "\n",
    "# Fine-tune the pretrained model (pretrained_model_path, pretrained_vocab_path) on n_ft_examples training pairs from the OCR dataset\n",
    "dm_finetune(input_file_path=\"./inputs/input_dfm_lstm_model_A_no_early_stopping.yaml\", \n",
    "            dataset_path=\"./dataset/ocr_trainval.txt\", \n",
    "            model_name=f\"wikigaz_en_ft_ocr_lstm_v001_n{n_ft_examples}\",\n",
    "            pretrained_model_path=\"./models/wikigaz_en_lstm_001/wikigaz_en_lstm_001.model\", \n",
    "            pretrained_vocab_path=\"./models/wikigaz_en_lstm_001/wikigaz_en_lstm_001.vocab\",\n",
    "            n_train_examples=n_ft_examples\n",
    "           )"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 34,
   "metadata": {
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:56:05\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mread input file: ./inputs/input_dfm_lstm_model_A_no_early_stopping.yaml\u001b[0m\n",
      "\u001b[92m2020-09-10 17:56:05\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32mpytorch will use: cuda:1\u001b[0m\n",
      "\u001b[92m2020-09-10 17:56:05\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mread CSV file: ./dataset/ocr_trainval.txt\u001b[0m\n",
      "\u001b[92m2020-09-10 17:56:05\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32mnumber of labels, True: 42301 and False: 42302\u001b[0m\n",
      "\u001b[92m2020-09-10 17:56:05\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mSplitting the Dataset\u001b[0m\n",
      "\u001b[92m2020-09-10 17:56:05\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mfinish splitting the Dataset. User time: 0.04768061637878418\u001b[0m\n",
      "\u001b[92m2020-09-10 17:56:05\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msplits are as follow:\n",
      "train    84000\n",
      "val        603\n",
      "Name: split, dtype: int64\u001b[0m\n",
      "\u001b[92m2020-09-10 17:56:05\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mstart creating a lookup table and convert characters to indices\u001b[0m\n",
      "\u001b[92m2020-09-10 17:56:05\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32m-- create vocabulary\u001b[0m\n",
      "\u001b[92m2020-09-10 17:56:06\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32m-- convert tokens to indices\u001b[0m\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      "length s1:   0%|          | 0/84000 [00:00<?, ?it/s]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:56:07\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mskipping 0 lines\u001b[0m\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "                                                                     \r"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "\n",
      "============================================================\n",
      "List all parameters in the model\n",
      "============================================================\n",
      "emb.weight False\n",
      "rnn_1.weight_ih_l0 False\n",
      "rnn_1.weight_hh_l0 False\n",
      "rnn_1.bias_ih_l0 False\n",
      "rnn_1.bias_hh_l0 False\n",
      "rnn_1.weight_ih_l0_reverse False\n",
      "rnn_1.weight_hh_l0_reverse False\n",
      "rnn_1.bias_ih_l0_reverse False\n",
      "rnn_1.bias_hh_l0_reverse False\n",
      "rnn_1.weight_ih_l1 False\n",
      "rnn_1.weight_hh_l1 False\n",
      "rnn_1.bias_ih_l1 False\n",
      "rnn_1.bias_hh_l1 False\n",
      "rnn_1.weight_ih_l1_reverse False\n",
      "rnn_1.weight_hh_l1_reverse False\n",
      "rnn_1.bias_ih_l1_reverse False\n",
      "rnn_1.bias_hh_l1_reverse False\n",
      "attn_step1.weight False\n",
      "attn_step1.bias False\n",
      "attn_step2.weight False\n",
      "attn_step2.bias False\n",
      "fc1.weight True\n",
      "fc1.bias True\n",
      "fc2.weight True\n",
      "fc2.bias True\n",
      "============================================================\n",
      "\n",
      "\n",
      "\n",
      "\u001b[92m2020-09-10 17:56:08\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m******************************\u001b[0m\n",
      "\u001b[92m2020-09-10 17:56:08\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m**** (Bi-directional) LSTM ****\u001b[0m\n",
      "\u001b[92m2020-09-10 17:56:08\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m******************************\u001b[0m\n",
      "\u001b[92m2020-09-10 17:56:08\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mNumber of batches: 1313\u001b[0m\n",
      "\u001b[92m2020-09-10 17:56:08\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mNumber of epochs: 10\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "b875cfb4ebd34bd9a76626fcb0f61d95",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=10.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=1313.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "\n",
      "\n",
      "====================\n",
      "Total number of params: 721323\n",
      "\n",
      "two_parallel_rnns (\n",
      "  (emb): Embedding(7542, 60), weights=((7542, 60),), parameters=452520\n",
      "  (rnn_1): LSTM(60, 60, num_layers=2, dropout=0.01, bidirectional=True), weights=((240, 60), (240, 60), (240,), (240,), (240, 60), (240, 60), (240,), (240,), (240, 120), (240, 60), (240,), (240,), (240, 120), (240, 60), (240,), (240,)), parameters=145920\n",
      "  (attn_step1): Linear(in_features=120, out_features=60, bias=True), weights=((60, 120), (60,)), parameters=7260\n",
      "  (attn_step2): Linear(in_features=60, out_features=1, bias=True), weights=((1, 60), (1,)), parameters=61\n",
      "  (fc1): Linear(in_features=960, out_features=120, bias=True), weights=((120, 960), (120,)), parameters=115320\n",
      "  (fc2): Linear(in_features=120, out_features=2, bias=True), weights=((2, 120), (2,)), parameters=242\n",
      ")\n",
      "====================\n",
      "\n",
      "\n",
      "\u001b[92m2020-09-10 17:56:47\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_17:56:47 -- Epoch: 1/10; Train; loss: 0.382; acc: 0.841; precision: 0.832, recall: 0.855, macrof1: 0.841, weightedf1: 0.841\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=10.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:56:48\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_17:56:48 -- Epoch: 1/10; Valid; loss: 0.229; acc: 0.907; precision: 0.910, recall: 0.904, macrof1: 0.907, weightedf1: 0.907\u001b[0m\n",
      "\u001b[92m2020-09-10 17:56:48\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=1313.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:57:31\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_17:57:31 -- Epoch: 2/10; Train; loss: 0.223; acc: 0.913; precision: 0.903, recall: 0.926, macrof1: 0.913, weightedf1: 0.913\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=10.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:57:31\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_17:57:31 -- Epoch: 2/10; Valid; loss: 0.206; acc: 0.930; precision: 0.930, recall: 0.930, macrof1: 0.930, weightedf1: 0.930\u001b[0m\n",
      "\u001b[92m2020-09-10 17:57:31\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=1313.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:58:14\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_17:58:14 -- Epoch: 3/10; Train; loss: 0.178; acc: 0.932; precision: 0.922, recall: 0.943, macrof1: 0.932, weightedf1: 0.932\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=10.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:58:14\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_17:58:14 -- Epoch: 3/10; Valid; loss: 0.183; acc: 0.924; precision: 0.907, recall: 0.944, macrof1: 0.924, weightedf1: 0.924\u001b[0m\n",
      "\u001b[92m2020-09-10 17:58:14\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=1313.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:58:54\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_17:58:54 -- Epoch: 4/10; Train; loss: 0.145; acc: 0.945; precision: 0.937, recall: 0.954, macrof1: 0.945, weightedf1: 0.945\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=10.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:58:54\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_17:58:54 -- Epoch: 4/10; Valid; loss: 0.171; acc: 0.934; precision: 0.931, recall: 0.937, macrof1: 0.934, weightedf1: 0.934\u001b[0m\n",
      "\u001b[92m2020-09-10 17:58:54\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=1313.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:59:34\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_17:59:34 -- Epoch: 5/10; Train; loss: 0.123; acc: 0.955; precision: 0.948, recall: 0.962, macrof1: 0.955, weightedf1: 0.955\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=10.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 17:59:34\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_17:59:34 -- Epoch: 5/10; Valid; loss: 0.154; acc: 0.937; precision: 0.931, recall: 0.944, macrof1: 0.937, weightedf1: 0.937\u001b[0m\n",
      "\u001b[92m2020-09-10 17:59:34\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=1313.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:00:14\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_18:00:14 -- Epoch: 6/10; Train; loss: 0.104; acc: 0.962; precision: 0.956, recall: 0.969, macrof1: 0.962, weightedf1: 0.962\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=10.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:00:14\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_18:00:14 -- Epoch: 6/10; Valid; loss: 0.169; acc: 0.935; precision: 0.943, recall: 0.927, macrof1: 0.935, weightedf1: 0.935\u001b[0m\n",
      "\u001b[92m2020-09-10 18:00:14\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=1313.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:00:53\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_18:00:53 -- Epoch: 7/10; Train; loss: 0.088; acc: 0.968; precision: 0.963, recall: 0.974, macrof1: 0.968, weightedf1: 0.968\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=10.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:00:54\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_18:00:54 -- Epoch: 7/10; Valid; loss: 0.183; acc: 0.929; precision: 0.911, recall: 0.950, macrof1: 0.929, weightedf1: 0.929\u001b[0m\n",
      "\u001b[92m2020-09-10 18:00:54\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=1313.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:01:38\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_18:01:38 -- Epoch: 8/10; Train; loss: 0.074; acc: 0.974; precision: 0.969, recall: 0.979, macrof1: 0.974, weightedf1: 0.974\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=10.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:01:38\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_18:01:38 -- Epoch: 8/10; Valid; loss: 0.178; acc: 0.945; precision: 0.944, recall: 0.947, macrof1: 0.945, weightedf1: 0.945\u001b[0m\n",
      "\u001b[92m2020-09-10 18:01:38\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=1313.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:02:21\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_18:02:21 -- Epoch: 9/10; Train; loss: 0.064; acc: 0.978; precision: 0.974, recall: 0.982, macrof1: 0.978, weightedf1: 0.978\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=10.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:02:21\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_18:02:21 -- Epoch: 9/10; Valid; loss: 0.182; acc: 0.935; precision: 0.920, recall: 0.953, macrof1: 0.935, weightedf1: 0.935\u001b[0m\n",
      "\u001b[92m2020-09-10 18:02:21\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=1313.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:03:01\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_18:03:01 -- Epoch: 10/10; Train; loss: 0.054; acc: 0.982; precision: 0.978, recall: 0.986, macrof1: 0.982, weightedf1: 0.982\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=10.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:03:02\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_18:03:02 -- Epoch: 10/10; Valid; loss: 0.199; acc: 0.934; precision: 0.917, recall: 0.953, macrof1: 0.934, weightedf1: 0.934\u001b[0m\n",
      "\u001b[92m2020-09-10 18:03:02\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n",
      "\n",
      "\u001b[92m2020-09-10 18:03:02\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model with least valid loss (checkpoint: 5) at ./models/wikigaz_en_ft_ocr_lstm_v001_n84000/wikigaz_en_ft_ocr_lstm_v001_n84000.model\u001b[0m\n",
      "\n",
      "\n",
      "\n",
      "====================\n",
      "User time: 413.5969\n",
      "====================\n"
     ]
    }
   ],
   "source": [
    "from DeezyMatch import finetune as dm_finetune\n",
    "\n",
    "n_ft_examples = 84000\n",
    "\n",
    "# Fine-tune the pretrained model (pretrained_model_path, pretrained_vocab_path)\n",
    "# using n_ft_examples training examples from dataset_path\n",
    "dm_finetune(input_file_path=\"./inputs/input_dfm_lstm_model_A_no_early_stopping.yaml\",\n",
    "            dataset_path=\"./dataset/ocr_trainval.txt\",\n",
    "            model_name=f\"wikigaz_en_ft_ocr_lstm_v001_n{n_ft_examples}\",\n",
    "            pretrained_model_path=\"./models/wikigaz_en_lstm_001/wikigaz_en_lstm_001.model\",\n",
    "            pretrained_vocab_path=\"./models/wikigaz_en_lstm_001/wikigaz_en_lstm_001.vocab\",\n",
    "            n_train_examples=n_ft_examples\n",
    "           )"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Fine-Tune, model A, RNN"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 35,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:03:02\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mread input file: ./inputs/input_dfm_rnn_model_A.yaml\u001b[0m\n",
      "\u001b[92m2020-09-10 18:03:02\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32mpytorch will use: cuda:1\u001b[0m\n",
      "\u001b[92m2020-09-10 18:03:02\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mread CSV file: ./dataset/ocr_trainval.txt\u001b[0m\n",
      "\u001b[92m2020-09-10 18:03:02\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32mnumber of labels, True: 42301 and False: 42302\u001b[0m\n",
      "\u001b[92m2020-09-10 18:03:02\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mSplitting the Dataset\u001b[0m\n",
      "\u001b[92m2020-09-10 18:03:02\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mfinish splitting the Dataset. User time: 0.0529019832611084\u001b[0m\n",
      "\u001b[92m2020-09-10 18:03:02\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msplits are as follow:\n",
      "not_assigned    58971\n",
      "val             25380\n",
      "train             250\n",
      "test                2\n",
      "Name: split, dtype: int64\u001b[0m\n",
      "\u001b[92m2020-09-10 18:03:02\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mstart creating a lookup table and convert characters to indices\u001b[0m\n",
      "\u001b[92m2020-09-10 18:03:03\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32m-- create vocabulary\u001b[0m\n",
      "\u001b[92m2020-09-10 18:03:04\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32m-- convert tokens to indices\u001b[0m\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "s1 padding:   0%|          | 0/25380 [00:00<?, ?it/s]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:03:04\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mskipping 0 lines\u001b[0m\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "                                                                     \r"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "\n",
      "============================================================\n",
      "List all parameters in the model\n",
      "============================================================\n",
      "emb.weight False\n",
      "rnn_1.weight_ih_l0 False\n",
      "rnn_1.weight_hh_l0 False\n",
      "rnn_1.bias_ih_l0 False\n",
      "rnn_1.bias_hh_l0 False\n",
      "rnn_1.weight_ih_l0_reverse False\n",
      "rnn_1.weight_hh_l0_reverse False\n",
      "rnn_1.bias_ih_l0_reverse False\n",
      "rnn_1.bias_hh_l0_reverse False\n",
      "rnn_1.weight_ih_l1 False\n",
      "rnn_1.weight_hh_l1 False\n",
      "rnn_1.bias_ih_l1 False\n",
      "rnn_1.bias_hh_l1 False\n",
      "rnn_1.weight_ih_l1_reverse False\n",
      "rnn_1.weight_hh_l1_reverse False\n",
      "rnn_1.bias_ih_l1_reverse False\n",
      "rnn_1.bias_hh_l1_reverse False\n",
      "attn_step1.weight False\n",
      "attn_step1.bias False\n",
      "attn_step2.weight False\n",
      "attn_step2.bias False\n",
      "fc1.weight True\n",
      "fc1.bias True\n",
      "fc2.weight True\n",
      "fc2.bias True\n",
      "============================================================\n",
      "\n",
      "\n",
      "\n",
      "\u001b[92m2020-09-10 18:03:05\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m******************************\u001b[0m\n",
      "\u001b[92m2020-09-10 18:03:05\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m**** (Bi-directional) RNN ****\u001b[0m\n",
      "\u001b[92m2020-09-10 18:03:05\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m******************************\u001b[0m\n",
      "\u001b[92m2020-09-10 18:03:05\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mNumber of batches: 4\u001b[0m\n",
      "\u001b[92m2020-09-10 18:03:05\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mNumber of epochs: 20\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "3fd78002feee4ee5b11f1f6c51d8e326",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=20.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=4.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "\n",
      "\n",
      "====================\n",
      "Total number of params: 611883\n",
      "\n",
      "two_parallel_rnns (\n",
      "  (emb): Embedding(7542, 60), weights=((7542, 60),), parameters=452520\n",
      "  (rnn_1): RNN(60, 60, num_layers=2, dropout=0.01, bidirectional=True), weights=((60, 60), (60, 60), (60,), (60,), (60, 60), (60, 60), (60,), (60,), (60, 120), (60, 60), (60,), (60,), (60, 120), (60, 60), (60,), (60,)), parameters=36480\n",
      "  (attn_step1): Linear(in_features=120, out_features=60, bias=True), weights=((60, 120), (60,)), parameters=7260\n",
      "  (attn_step2): Linear(in_features=60, out_features=1, bias=True), weights=((1, 60), (1,)), parameters=61\n",
      "  (fc1): Linear(in_features=960, out_features=120, bias=True), weights=((120, 960), (120,)), parameters=115320\n",
      "  (fc2): Linear(in_features=120, out_features=2, bias=True), weights=((2, 120), (2,)), parameters=242\n",
      ")\n",
      "====================\n",
      "\n",
      "\n",
      "\u001b[92m2020-09-10 18:03:05\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_18:03:05 -- Epoch: 1/20; Train; loss: 1.058; acc: 0.520; precision: 0.522, recall: 0.480, macrof1: 0.519, weightedf1: 0.519\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:03:12\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_18:03:12 -- Epoch: 1/20; Valid; loss: 1.051; acc: 0.498; precision: 0.498, recall: 0.514, macrof1: 0.498, weightedf1: 0.498\u001b[0m\n",
      "\u001b[92m2020-09-10 18:03:12\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=4.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:03:13\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_18:03:13 -- Epoch: 2/20; Train; loss: 0.857; acc: 0.564; precision: 0.566, recall: 0.552, macrof1: 0.564, weightedf1: 0.564\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:03:20\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_18:03:20 -- Epoch: 2/20; Valid; loss: 0.944; acc: 0.517; precision: 0.515, recall: 0.582, macrof1: 0.515, weightedf1: 0.515\u001b[0m\n",
      "\u001b[92m2020-09-10 18:03:20\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=4.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:03:21\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_18:03:21 -- Epoch: 3/20; Train; loss: 0.728; acc: 0.608; precision: 0.602, recall: 0.640, macrof1: 0.608, weightedf1: 0.608\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:03:28\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_18:03:28 -- Epoch: 3/20; Valid; loss: 0.872; acc: 0.536; precision: 0.530, recall: 0.634, macrof1: 0.531, weightedf1: 0.531\u001b[0m\n",
      "\u001b[92m2020-09-10 18:03:28\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=4.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:03:29\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_18:03:29 -- Epoch: 4/20; Train; loss: 0.643; acc: 0.644; precision: 0.630, recall: 0.696, macrof1: 0.643, weightedf1: 0.643\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:03:36\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_18:03:36 -- Epoch: 4/20; Valid; loss: 0.821; acc: 0.546; precision: 0.538, recall: 0.648, macrof1: 0.541, weightedf1: 0.541\u001b[0m\n",
      "\u001b[92m2020-09-10 18:03:36\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=4.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:03:37\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_18:03:37 -- Epoch: 5/20; Train; loss: 0.578; acc: 0.668; precision: 0.657, recall: 0.704, macrof1: 0.668, weightedf1: 0.668\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:03:44\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_18:03:44 -- Epoch: 5/20; Valid; loss: 0.786; acc: 0.555; precision: 0.547, recall: 0.643, macrof1: 0.551, weightedf1: 0.551\u001b[0m\n",
      "\u001b[92m2020-09-10 18:03:44\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=4.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:03:44\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_18:03:44 -- Epoch: 6/20; Train; loss: 0.531; acc: 0.716; precision: 0.705, recall: 0.744, macrof1: 0.716, weightedf1: 0.716\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:03:52\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_18:03:52 -- Epoch: 6/20; Valid; loss: 0.766; acc: 0.567; precision: 0.557, recall: 0.647, macrof1: 0.564, weightedf1: 0.564\u001b[0m\n",
      "\u001b[92m2020-09-10 18:03:52\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=4.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:03:52\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_18:03:52 -- Epoch: 7/20; Train; loss: 0.496; acc: 0.748; precision: 0.738, recall: 0.768, macrof1: 0.748, weightedf1: 0.748\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:04:00\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_18:04:00 -- Epoch: 7/20; Valid; loss: 0.752; acc: 0.575; precision: 0.566, recall: 0.643, macrof1: 0.573, weightedf1: 0.573\u001b[0m\n",
      "\u001b[92m2020-09-10 18:04:00\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=4.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:04:00\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_18:04:00 -- Epoch: 8/20; Train; loss: 0.468; acc: 0.772; precision: 0.774, recall: 0.768, macrof1: 0.772, weightedf1: 0.772\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:04:08\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_18:04:08 -- Epoch: 8/20; Valid; loss: 0.744; acc: 0.583; precision: 0.575, recall: 0.632, macrof1: 0.582, weightedf1: 0.582\u001b[0m\n",
      "\u001b[92m2020-09-10 18:04:08\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=4.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:04:08\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_18:04:08 -- Epoch: 9/20; Train; loss: 0.444; acc: 0.792; precision: 0.797, recall: 0.784, macrof1: 0.792, weightedf1: 0.792\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:04:16\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_18:04:16 -- Epoch: 9/20; Valid; loss: 0.740; acc: 0.592; precision: 0.585, recall: 0.630, macrof1: 0.591, weightedf1: 0.591\u001b[0m\n",
      "\u001b[92m2020-09-10 18:04:16\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=4.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:04:17\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_18:04:17 -- Epoch: 10/20; Train; loss: 0.415; acc: 0.812; precision: 0.831, recall: 0.784, macrof1: 0.812, weightedf1: 0.812\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:04:26\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_18:04:26 -- Epoch: 10/20; Valid; loss: 0.739; acc: 0.601; precision: 0.595, recall: 0.630, macrof1: 0.600, weightedf1: 0.600\u001b[0m\n",
      "\u001b[92m2020-09-10 18:04:26\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=4.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:04:26\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_18:04:26 -- Epoch: 11/20; Train; loss: 0.392; acc: 0.828; precision: 0.842, recall: 0.808, macrof1: 0.828, weightedf1: 0.828\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:04:34\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_18:04:34 -- Epoch: 11/20; Valid; loss: 0.743; acc: 0.610; precision: 0.603, recall: 0.644, macrof1: 0.609, weightedf1: 0.609\u001b[0m\n",
      "\u001b[92m2020-09-10 18:04:34\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model (early stopped) with least valid loss (checkpoint: 10) at ./models/wikigaz_en_ft_ocr_rnn_v001_n250/wikigaz_en_ft_ocr_rnn_v001_n250.model\u001b[0m\n",
      "\u001b[92m2020-09-10 18:04:35\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n",
      "\u001b[92m2020-09-10 18:04:35\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mEarly stopping at epoch: 11, selected epoch: 10\u001b[0m\n",
      "\n",
      "\n",
      "\n",
      "\n",
      "====================\n",
      "User time: 90.0004\n",
      "====================\n"
     ]
    }
   ],
   "source": [
    "from DeezyMatch import finetune as dm_finetune\n",
    "\n",
    "n_ft_examples = 250\n",
    "\n",
    "# Fine-tune the pretrained model (pretrained_model_path, pretrained_vocab_path)\n",
    "# using n_ft_examples training examples from dataset_path\n",
    "dm_finetune(input_file_path=\"./inputs/input_dfm_rnn_model_A.yaml\",\n",
    "            dataset_path=\"./dataset/ocr_trainval.txt\",\n",
    "            model_name=f\"wikigaz_en_ft_ocr_rnn_v001_n{n_ft_examples}\",\n",
    "            pretrained_model_path=\"./models/wikigaz_en_rnn_001/wikigaz_en_rnn_001.model\",\n",
    "            pretrained_vocab_path=\"./models/wikigaz_en_rnn_001/wikigaz_en_rnn_001.vocab\",\n",
    "            n_train_examples=n_ft_examples\n",
    "           )"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 36,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:04:35\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mread input file: ./inputs/input_dfm_rnn_model_A.yaml\u001b[0m\n",
      "\u001b[92m2020-09-10 18:04:35\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32mpytorch will use: cuda:1\u001b[0m\n",
      "\u001b[92m2020-09-10 18:04:35\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mread CSV file: ./dataset/ocr_trainval.txt\u001b[0m\n",
      "\u001b[92m2020-09-10 18:04:35\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32mnumber of labels, True: 42301 and False: 42302\u001b[0m\n",
      "\u001b[92m2020-09-10 18:04:35\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mSplitting the Dataset\u001b[0m\n",
      "\u001b[92m2020-09-10 18:04:35\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mfinish splitting the Dataset. User time: 0.054543495178222656\u001b[0m\n",
      "\u001b[92m2020-09-10 18:04:35\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msplits are as follow:\n",
      "not_assigned    58721\n",
      "val             25380\n",
      "train             500\n",
      "test                2\n",
      "Name: split, dtype: int64\u001b[0m\n",
      "\u001b[92m2020-09-10 18:04:35\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mstart creating a lookup table and convert characters to indices\u001b[0m\n",
      "\u001b[92m2020-09-10 18:04:35\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32m-- create vocabulary\u001b[0m\n",
      "\u001b[92m2020-09-10 18:04:36\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32m-- convert tokens to indices\u001b[0m\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "s1 padding:   0%|          | 0/25380 [00:00<?, ?it/s]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:04:37\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mskipping 0 lines\u001b[0m\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "                                                                     \r"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "\n",
      "============================================================\n",
      "List all parameters in the model\n",
      "============================================================\n",
      "emb.weight False\n",
      "rnn_1.weight_ih_l0 False\n",
      "rnn_1.weight_hh_l0 False\n",
      "rnn_1.bias_ih_l0 False\n",
      "rnn_1.bias_hh_l0 False\n",
      "rnn_1.weight_ih_l0_reverse False\n",
      "rnn_1.weight_hh_l0_reverse False\n",
      "rnn_1.bias_ih_l0_reverse False\n",
      "rnn_1.bias_hh_l0_reverse False\n",
      "rnn_1.weight_ih_l1 False\n",
      "rnn_1.weight_hh_l1 False\n",
      "rnn_1.bias_ih_l1 False\n",
      "rnn_1.bias_hh_l1 False\n",
      "rnn_1.weight_ih_l1_reverse False\n",
      "rnn_1.weight_hh_l1_reverse False\n",
      "rnn_1.bias_ih_l1_reverse False\n",
      "rnn_1.bias_hh_l1_reverse False\n",
      "attn_step1.weight False\n",
      "attn_step1.bias False\n",
      "attn_step2.weight False\n",
      "attn_step2.bias False\n",
      "fc1.weight True\n",
      "fc1.bias True\n",
      "fc2.weight True\n",
      "fc2.bias True\n",
      "============================================================\n",
      "\n",
      "\n",
      "\n",
      "\u001b[92m2020-09-10 18:04:38\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m******************************\u001b[0m\n",
      "\u001b[92m2020-09-10 18:04:38\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m**** (Bi-directional) RNN ****\u001b[0m\n",
      "\u001b[92m2020-09-10 18:04:38\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m******************************\u001b[0m\n",
      "\u001b[92m2020-09-10 18:04:38\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mNumber of batches: 8\u001b[0m\n",
      "\u001b[92m2020-09-10 18:04:38\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mNumber of epochs: 20\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "52866ef5ffd643ed8277ebdddb38e9bd",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=20.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=8.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "\n",
      "\n",
      "====================\n",
      "Total number of params: 611883\n",
      "\n",
      "two_parallel_rnns (\n",
      "  (emb): Embedding(7542, 60), weights=((7542, 60),), parameters=452520\n",
      "  (rnn_1): RNN(60, 60, num_layers=2, dropout=0.01, bidirectional=True), weights=((60, 60), (60, 60), (60,), (60,), (60, 60), (60, 60), (60,), (60,), (60, 120), (60, 60), (60,), (60,), (60, 120), (60, 60), (60,), (60,)), parameters=36480\n",
      "  (attn_step1): Linear(in_features=120, out_features=60, bias=True), weights=((60, 120), (60,)), parameters=7260\n",
      "  (attn_step2): Linear(in_features=60, out_features=1, bias=True), weights=((1, 60), (1,)), parameters=61\n",
      "  (fc1): Linear(in_features=960, out_features=120, bias=True), weights=((120, 960), (120,)), parameters=115320\n",
      "  (fc2): Linear(in_features=120, out_features=2, bias=True), weights=((2, 120), (2,)), parameters=242\n",
      ")\n",
      "====================\n",
      "\n",
      "\n",
      "\u001b[92m2020-09-10 18:04:38\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_18:04:38 -- Epoch: 1/20; Train; loss: 1.054; acc: 0.492; precision: 0.492, recall: 0.492, macrof1: 0.492, weightedf1: 0.492\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:04:47\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_18:04:47 -- Epoch: 1/20; Valid; loss: 0.930; acc: 0.511; precision: 0.510, recall: 0.551, macrof1: 0.510, weightedf1: 0.510\u001b[0m\n",
      "\u001b[92m2020-09-10 18:04:47\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=8.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:04:47\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_18:04:47 -- Epoch: 2/20; Train; loss: 0.804; acc: 0.554; precision: 0.552, recall: 0.576, macrof1: 0.554, weightedf1: 0.554\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:04:56\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_18:04:56 -- Epoch: 2/20; Valid; loss: 0.791; acc: 0.534; precision: 0.530, recall: 0.614, macrof1: 0.531, weightedf1: 0.531\u001b[0m\n",
      "\u001b[92m2020-09-10 18:04:56\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=8.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:04:56\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_18:04:56 -- Epoch: 3/20; Train; loss: 0.687; acc: 0.592; precision: 0.584, recall: 0.640, macrof1: 0.591, weightedf1: 0.591\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:05:05\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_18:05:05 -- Epoch: 3/20; Valid; loss: 0.728; acc: 0.554; precision: 0.545, recall: 0.664, macrof1: 0.549, weightedf1: 0.549\u001b[0m\n",
      "\u001b[92m2020-09-10 18:05:05\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=8.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:05:05\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_18:05:05 -- Epoch: 4/20; Train; loss: 0.626; acc: 0.626; precision: 0.613, recall: 0.684, macrof1: 0.625, weightedf1: 0.625\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:05:14\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_18:05:14 -- Epoch: 4/20; Valid; loss: 0.694; acc: 0.573; precision: 0.560, recall: 0.676, macrof1: 0.568, weightedf1: 0.568\u001b[0m\n",
      "\u001b[92m2020-09-10 18:05:14\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=8.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:05:14\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_18:05:14 -- Epoch: 5/20; Train; loss: 0.587; acc: 0.666; precision: 0.655, recall: 0.700, macrof1: 0.666, weightedf1: 0.666\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:05:22\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_18:05:22 -- Epoch: 5/20; Valid; loss: 0.673; acc: 0.593; precision: 0.580, recall: 0.674, macrof1: 0.591, weightedf1: 0.591\u001b[0m\n",
      "\u001b[92m2020-09-10 18:05:22\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=8.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:05:22\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_18:05:22 -- Epoch: 6/20; Train; loss: 0.551; acc: 0.684; precision: 0.676, recall: 0.708, macrof1: 0.684, weightedf1: 0.684\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:05:30\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_18:05:30 -- Epoch: 6/20; Valid; loss: 0.660; acc: 0.618; precision: 0.608, recall: 0.668, macrof1: 0.617, weightedf1: 0.617\u001b[0m\n",
      "\u001b[92m2020-09-10 18:05:30\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=8.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:05:30\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_18:05:30 -- Epoch: 7/20; Train; loss: 0.520; acc: 0.726; precision: 0.738, recall: 0.700, macrof1: 0.726, weightedf1: 0.726\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:05:38\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_18:05:38 -- Epoch: 7/20; Valid; loss: 0.649; acc: 0.644; precision: 0.643, recall: 0.647, macrof1: 0.644, weightedf1: 0.644\u001b[0m\n",
      "\u001b[92m2020-09-10 18:05:38\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=8.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:05:38\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_18:05:38 -- Epoch: 8/20; Train; loss: 0.486; acc: 0.734; precision: 0.751, recall: 0.700, macrof1: 0.734, weightedf1: 0.734\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:05:46\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_18:05:46 -- Epoch: 8/20; Valid; loss: 0.642; acc: 0.661; precision: 0.667, recall: 0.644, macrof1: 0.661, weightedf1: 0.661\u001b[0m\n",
      "\u001b[92m2020-09-10 18:05:46\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=8.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:05:46\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_18:05:46 -- Epoch: 9/20; Train; loss: 0.456; acc: 0.778; precision: 0.779, recall: 0.776, macrof1: 0.778, weightedf1: 0.778\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:05:54\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_18:05:54 -- Epoch: 9/20; Valid; loss: 0.638; acc: 0.671; precision: 0.669, recall: 0.677, macrof1: 0.671, weightedf1: 0.671\u001b[0m\n",
      "\u001b[92m2020-09-10 18:05:54\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=8.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:05:54\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_18:05:54 -- Epoch: 10/20; Train; loss: 0.421; acc: 0.798; precision: 0.802, recall: 0.792, macrof1: 0.798, weightedf1: 0.798\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:06:02\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_18:06:02 -- Epoch: 10/20; Valid; loss: 0.632; acc: 0.683; precision: 0.687, recall: 0.671, macrof1: 0.683, weightedf1: 0.683\u001b[0m\n",
      "\u001b[92m2020-09-10 18:06:02\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=8.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:06:02\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_18:06:02 -- Epoch: 11/20; Train; loss: 0.390; acc: 0.818; precision: 0.824, recall: 0.808, macrof1: 0.818, weightedf1: 0.818\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:06:10\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_18:06:10 -- Epoch: 11/20; Valid; loss: 0.631; acc: 0.688; precision: 0.690, recall: 0.683, macrof1: 0.688, weightedf1: 0.688\u001b[0m\n",
      "\u001b[92m2020-09-10 18:06:10\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=8.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:06:11\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_18:06:11 -- Epoch: 12/20; Train; loss: 0.357; acc: 0.842; precision: 0.841, recall: 0.844, macrof1: 0.842, weightedf1: 0.842\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:06:18\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_18:06:18 -- Epoch: 12/20; Valid; loss: 0.635; acc: 0.695; precision: 0.702, recall: 0.677, macrof1: 0.695, weightedf1: 0.695\u001b[0m\n",
      "\u001b[92m2020-09-10 18:06:18\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model (early stopped) with least valid loss (checkpoint: 11) at ./models/wikigaz_en_ft_ocr_rnn_v001_n500/wikigaz_en_ft_ocr_rnn_v001_n500.model\u001b[0m\n",
      "\u001b[92m2020-09-10 18:06:18\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n",
      "\u001b[92m2020-09-10 18:06:18\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mEarly stopping at epoch: 12, selected epoch: 11\u001b[0m\n",
      "\n",
      "\n",
      "\n",
      "\n",
      "====================\n",
      "User time: 100.9527\n",
      "====================\n"
     ]
    }
   ],
   "source": [
    "from DeezyMatch import finetune as dm_finetune\n",
    "\n",
    "n_ft_examples = 500\n",
    "\n",
    "# Fine-tune the pretrained model (pretrained_model_path / pretrained_vocab_path)\n",
    "# on n_ft_examples training examples from the OCR dataset.\n",
    "# The model A input file freezes the embedding and recurrent layers.\n",
    "dm_finetune(input_file_path=\"./inputs/input_dfm_rnn_model_A.yaml\",\n",
    "            dataset_path=\"./dataset/ocr_trainval.txt\",\n",
    "            model_name=f\"wikigaz_en_ft_ocr_rnn_v001_n{n_ft_examples}\",\n",
    "            pretrained_model_path=\"./models/wikigaz_en_rnn_001/wikigaz_en_rnn_001.model\",\n",
    "            pretrained_vocab_path=\"./models/wikigaz_en_rnn_001/wikigaz_en_rnn_001.vocab\",\n",
    "            n_train_examples=n_ft_examples\n",
    "           )"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 37,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:06:18\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mread input file: ./inputs/input_dfm_rnn_model_A.yaml\u001b[0m\n",
      "\u001b[92m2020-09-10 18:06:18\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32mpytorch will use: cuda:1\u001b[0m\n",
      "\u001b[92m2020-09-10 18:06:18\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mread CSV file: ./dataset/ocr_trainval.txt\u001b[0m\n",
      "\u001b[92m2020-09-10 18:06:19\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32mnumber of labels, True: 42301 and False: 42302\u001b[0m\n",
      "\u001b[92m2020-09-10 18:06:19\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mSplitting the Dataset\u001b[0m\n",
      "\u001b[92m2020-09-10 18:06:19\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mfinish splitting the Dataset. User time: 0.045389652252197266\u001b[0m\n",
      "\u001b[92m2020-09-10 18:06:19\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msplits are as follow:\n",
      "not_assigned    58221\n",
      "val             25380\n",
      "train            1000\n",
      "test                2\n",
      "Name: split, dtype: int64\u001b[0m\n",
      "\u001b[92m2020-09-10 18:06:19\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mstart creating a lookup table and convert characters to indices\u001b[0m\n",
      "\u001b[92m2020-09-10 18:06:19\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32m-- create vocabulary\u001b[0m\n",
      "\u001b[92m2020-09-10 18:06:20\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32m-- convert tokens to indices\u001b[0m\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "s1 padding:   0%|          | 0/25380 [00:00<?, ?it/s]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:06:21\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mskipping 0 lines\u001b[0m\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "                                                                     \r"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "\n",
      "============================================================\n",
      "List all parameters in the model\n",
      "============================================================\n",
      "emb.weight False\n",
      "rnn_1.weight_ih_l0 False\n",
      "rnn_1.weight_hh_l0 False\n",
      "rnn_1.bias_ih_l0 False\n",
      "rnn_1.bias_hh_l0 False\n",
      "rnn_1.weight_ih_l0_reverse False\n",
      "rnn_1.weight_hh_l0_reverse False\n",
      "rnn_1.bias_ih_l0_reverse False\n",
      "rnn_1.bias_hh_l0_reverse False\n",
      "rnn_1.weight_ih_l1 False\n",
      "rnn_1.weight_hh_l1 False\n",
      "rnn_1.bias_ih_l1 False\n",
      "rnn_1.bias_hh_l1 False\n",
      "rnn_1.weight_ih_l1_reverse False\n",
      "rnn_1.weight_hh_l1_reverse False\n",
      "rnn_1.bias_ih_l1_reverse False\n",
      "rnn_1.bias_hh_l1_reverse False\n",
      "attn_step1.weight False\n",
      "attn_step1.bias False\n",
      "attn_step2.weight False\n",
      "attn_step2.bias False\n",
      "fc1.weight True\n",
      "fc1.bias True\n",
      "fc2.weight True\n",
      "fc2.bias True\n",
      "============================================================\n",
      "\n",
      "\n",
      "\n",
      "\u001b[92m2020-09-10 18:06:21\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m******************************\u001b[0m\n",
      "\u001b[92m2020-09-10 18:06:21\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m**** (Bi-directional) RNN ****\u001b[0m\n",
      "\u001b[92m2020-09-10 18:06:21\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m******************************\u001b[0m\n",
      "\u001b[92m2020-09-10 18:06:21\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mNumber of batches: 16\u001b[0m\n",
      "\u001b[92m2020-09-10 18:06:21\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mNumber of epochs: 20\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "279fe64c424f4294a7c52f965984075c",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=20.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=16.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "\n",
      "\n",
      "====================\n",
      "Total number of params: 611883\n",
      "\n",
      "two_parallel_rnns (\n",
      "  (emb): Embedding(7542, 60), weights=((7542, 60),), parameters=452520\n",
      "  (rnn_1): RNN(60, 60, num_layers=2, dropout=0.01, bidirectional=True), weights=((60, 60), (60, 60), (60,), (60,), (60, 60), (60, 60), (60,), (60,), (60, 120), (60, 60), (60,), (60,), (60, 120), (60, 60), (60,), (60,)), parameters=36480\n",
      "  (attn_step1): Linear(in_features=120, out_features=60, bias=True), weights=((60, 120), (60,)), parameters=7260\n",
      "  (attn_step2): Linear(in_features=60, out_features=1, bias=True), weights=((1, 60), (1,)), parameters=61\n",
      "  (fc1): Linear(in_features=960, out_features=120, bias=True), weights=((120, 960), (120,)), parameters=115320\n",
      "  (fc2): Linear(in_features=120, out_features=2, bias=True), weights=((2, 120), (2,)), parameters=242\n",
      ")\n",
      "====================\n",
      "\n",
      "\n",
      "\u001b[92m2020-09-10 18:06:22\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_18:06:22 -- Epoch: 1/20; Train; loss: 0.965; acc: 0.499; precision: 0.499, recall: 0.492, macrof1: 0.499, weightedf1: 0.499\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:06:30\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_18:06:30 -- Epoch: 1/20; Valid; loss: 0.803; acc: 0.529; precision: 0.525, recall: 0.601, macrof1: 0.526, weightedf1: 0.526\u001b[0m\n",
      "\u001b[92m2020-09-10 18:06:30\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=16.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:06:30\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_18:06:30 -- Epoch: 2/20; Train; loss: 0.689; acc: 0.591; precision: 0.578, recall: 0.674, macrof1: 0.588, weightedf1: 0.588\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:06:38\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_18:06:38 -- Epoch: 2/20; Valid; loss: 0.696; acc: 0.568; precision: 0.557, recall: 0.660, macrof1: 0.564, weightedf1: 0.564\u001b[0m\n",
      "\u001b[92m2020-09-10 18:06:38\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=16.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:06:38\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_18:06:38 -- Epoch: 3/20; Train; loss: 0.612; acc: 0.633; precision: 0.621, recall: 0.684, macrof1: 0.632, weightedf1: 0.632\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:06:46\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_18:06:46 -- Epoch: 3/20; Valid; loss: 0.656; acc: 0.614; precision: 0.607, recall: 0.650, macrof1: 0.614, weightedf1: 0.614\u001b[0m\n",
      "\u001b[92m2020-09-10 18:06:46\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=16.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:06:47\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_18:06:47 -- Epoch: 4/20; Train; loss: 0.564; acc: 0.694; precision: 0.690, recall: 0.704, macrof1: 0.694, weightedf1: 0.694\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:06:56\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_18:06:56 -- Epoch: 4/20; Valid; loss: 0.632; acc: 0.658; precision: 0.655, recall: 0.665, macrof1: 0.658, weightedf1: 0.658\u001b[0m\n",
      "\u001b[92m2020-09-10 18:06:56\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=16.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:06:56\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_18:06:56 -- Epoch: 5/20; Train; loss: 0.520; acc: 0.722; precision: 0.715, recall: 0.738, macrof1: 0.722, weightedf1: 0.722\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:07:05\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_18:07:05 -- Epoch: 5/20; Valid; loss: 0.614; acc: 0.683; precision: 0.689, recall: 0.665, macrof1: 0.683, weightedf1: 0.683\u001b[0m\n",
      "\u001b[92m2020-09-10 18:07:05\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=16.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:07:05\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_18:07:05 -- Epoch: 6/20; Train; loss: 0.479; acc: 0.755; precision: 0.775, recall: 0.718, macrof1: 0.755, weightedf1: 0.755\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:07:14\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_18:07:14 -- Epoch: 6/20; Valid; loss: 0.600; acc: 0.695; precision: 0.696, recall: 0.694, macrof1: 0.695, weightedf1: 0.695\u001b[0m\n",
      "\u001b[92m2020-09-10 18:07:14\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=16.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:07:15\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_18:07:15 -- Epoch: 7/20; Train; loss: 0.440; acc: 0.784; precision: 0.780, recall: 0.792, macrof1: 0.784, weightedf1: 0.784\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:07:24\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_18:07:24 -- Epoch: 7/20; Valid; loss: 0.593; acc: 0.706; precision: 0.710, recall: 0.696, macrof1: 0.706, weightedf1: 0.706\u001b[0m\n",
      "\u001b[92m2020-09-10 18:07:24\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=16.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:07:24\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_18:07:24 -- Epoch: 8/20; Train; loss: 0.393; acc: 0.820; precision: 0.817, recall: 0.824, macrof1: 0.820, weightedf1: 0.820\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:07:33\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_18:07:33 -- Epoch: 8/20; Valid; loss: 0.591; acc: 0.714; precision: 0.708, recall: 0.726, macrof1: 0.714, weightedf1: 0.714\u001b[0m\n",
      "\u001b[92m2020-09-10 18:07:33\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=16.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:07:33\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_18:07:33 -- Epoch: 9/20; Train; loss: 0.361; acc: 0.841; precision: 0.865, recall: 0.808, macrof1: 0.841, weightedf1: 0.841\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:07:42\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_18:07:42 -- Epoch: 9/20; Valid; loss: 0.585; acc: 0.721; precision: 0.713, recall: 0.742, macrof1: 0.721, weightedf1: 0.721\u001b[0m\n",
      "\u001b[92m2020-09-10 18:07:42\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=16.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:07:43\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_18:07:43 -- Epoch: 10/20; Train; loss: 0.327; acc: 0.872; precision: 0.852, recall: 0.900, macrof1: 0.872, weightedf1: 0.872\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:07:51\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_18:07:51 -- Epoch: 10/20; Valid; loss: 0.589; acc: 0.723; precision: 0.734, recall: 0.699, macrof1: 0.723, weightedf1: 0.723\u001b[0m\n",
      "\u001b[92m2020-09-10 18:07:51\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model (early stopped) with least valid loss (checkpoint: 9) at ./models/wikigaz_en_ft_ocr_rnn_v001_n1000/wikigaz_en_ft_ocr_rnn_v001_n1000.model\u001b[0m\n",
      "\u001b[92m2020-09-10 18:07:51\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n",
      "\u001b[92m2020-09-10 18:07:51\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mEarly stopping at epoch: 10, selected epoch: 9\u001b[0m\n",
      "\n",
      "\n",
      "\n",
      "\n",
      "====================\n",
      "User time: 89.8996\n",
      "====================\n"
     ]
    }
   ],
   "source": [
    "from DeezyMatch import finetune as dm_finetune\n",
    "\n",
    "n_ft_examples = 1000  # number of OCR training examples used for fine-tuning\n",
    "\n",
    "# Fine-tune the pretrained WG:en baseline (pretrained_model_path, pretrained_vocab_path)\n",
    "# with the model A configuration: embedding and recurrent layers are frozen.\n",
    "dm_finetune(input_file_path=\"./inputs/input_dfm_rnn_model_A.yaml\",\n",
    "            dataset_path=\"./dataset/ocr_trainval.txt\",\n",
    "            model_name=f\"wikigaz_en_ft_ocr_rnn_v001_n{n_ft_examples}\",\n",
    "            pretrained_model_path=\"./models/wikigaz_en_rnn_001/wikigaz_en_rnn_001.model\",\n",
    "            pretrained_vocab_path=\"./models/wikigaz_en_rnn_001/wikigaz_en_rnn_001.vocab\",\n",
    "            n_train_examples=n_ft_examples\n",
    "           )"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 38,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:07:51\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mread input file: ./inputs/input_dfm_rnn_model_A.yaml\u001b[0m\n",
      "\u001b[92m2020-09-10 18:07:51\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32mpytorch will use: cuda:1\u001b[0m\n",
      "\u001b[92m2020-09-10 18:07:51\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mread CSV file: ./dataset/ocr_trainval.txt\u001b[0m\n",
      "\u001b[92m2020-09-10 18:07:52\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32mnumber of labels, True: 42301 and False: 42302\u001b[0m\n",
      "\u001b[92m2020-09-10 18:07:52\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mSplitting the Dataset\u001b[0m\n",
      "\u001b[92m2020-09-10 18:07:52\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mfinish splitting the Dataset. User time: 0.05057549476623535\u001b[0m\n",
      "\u001b[92m2020-09-10 18:07:52\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msplits are as follow:\n",
      "not_assigned    57221\n",
      "val             25380\n",
      "train            2000\n",
      "test                2\n",
      "Name: split, dtype: int64\u001b[0m\n",
      "\u001b[92m2020-09-10 18:07:52\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mstart creating a lookup table and convert characters to indices\u001b[0m\n",
      "\u001b[92m2020-09-10 18:07:52\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32m-- create vocabulary\u001b[0m\n",
      "\u001b[92m2020-09-10 18:07:53\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32m-- convert tokens to indices\u001b[0m\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:07:54\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mskipping 0 lines\u001b[0m\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "\n",
      "============================================================\n",
      "List all parameters in the model\n",
      "============================================================\n",
      "emb.weight False\n",
      "rnn_1.weight_ih_l0 False\n",
      "rnn_1.weight_hh_l0 False\n",
      "rnn_1.bias_ih_l0 False\n",
      "rnn_1.bias_hh_l0 False\n",
      "rnn_1.weight_ih_l0_reverse False\n",
      "rnn_1.weight_hh_l0_reverse False\n",
      "rnn_1.bias_ih_l0_reverse False\n",
      "rnn_1.bias_hh_l0_reverse False\n",
      "rnn_1.weight_ih_l1 False\n",
      "rnn_1.weight_hh_l1 False\n",
      "rnn_1.bias_ih_l1 False\n",
      "rnn_1.bias_hh_l1 False\n",
      "rnn_1.weight_ih_l1_reverse False\n",
      "rnn_1.weight_hh_l1_reverse False\n",
      "rnn_1.bias_ih_l1_reverse False\n",
      "rnn_1.bias_hh_l1_reverse False\n",
      "attn_step1.weight False\n",
      "attn_step1.bias False\n",
      "attn_step2.weight False\n",
      "attn_step2.bias False\n",
      "fc1.weight True\n",
      "fc1.bias True\n",
      "fc2.weight True\n",
      "fc2.bias True\n",
      "============================================================\n",
      "\n",
      "\n",
      "\n",
      "\u001b[92m2020-09-10 18:07:54\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m******************************\u001b[0m\n",
      "\u001b[92m2020-09-10 18:07:54\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m**** (Bi-directional) RNN ****\u001b[0m\n",
      "\u001b[92m2020-09-10 18:07:54\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m******************************\u001b[0m\n",
      "\u001b[92m2020-09-10 18:07:54\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mNumber of batches: 32\u001b[0m\n",
      "\u001b[92m2020-09-10 18:07:54\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mNumber of epochs: 20\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "81dfd3375dbe4744b90929e2ac888493",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=20.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=32.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "\n",
      "\n",
      "====================\n",
      "Total number of params: 611883\n",
      "\n",
      "two_parallel_rnns (\n",
      "  (emb): Embedding(7542, 60), weights=((7542, 60),), parameters=452520\n",
      "  (rnn_1): RNN(60, 60, num_layers=2, dropout=0.01, bidirectional=True), weights=((60, 60), (60, 60), (60,), (60,), (60, 60), (60, 60), (60,), (60,), (60, 120), (60, 60), (60,), (60,), (60, 120), (60, 60), (60,), (60,)), parameters=36480\n",
      "  (attn_step1): Linear(in_features=120, out_features=60, bias=True), weights=((60, 120), (60,)), parameters=7260\n",
      "  (attn_step2): Linear(in_features=60, out_features=1, bias=True), weights=((1, 60), (1,)), parameters=61\n",
      "  (fc1): Linear(in_features=960, out_features=120, bias=True), weights=((120, 960), (120,)), parameters=115320\n",
      "  (fc2): Linear(in_features=120, out_features=2, bias=True), weights=((2, 120), (2,)), parameters=242\n",
      ")\n",
      "====================\n",
      "\n",
      "\n",
      "\u001b[92m2020-09-10 18:07:55\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_18:07:55 -- Epoch: 1/20; Train; loss: 0.814; acc: 0.550; precision: 0.545, recall: 0.607, macrof1: 0.549, weightedf1: 0.549\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:08:03\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_18:08:03 -- Epoch: 1/20; Valid; loss: 0.676; acc: 0.595; precision: 0.581, recall: 0.684, macrof1: 0.592, weightedf1: 0.592\u001b[0m\n",
      "\u001b[92m2020-09-10 18:08:03\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=32.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:08:04\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_18:08:04 -- Epoch: 2/20; Train; loss: 0.602; acc: 0.667; precision: 0.652, recall: 0.717, macrof1: 0.666, weightedf1: 0.666\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:08:11\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_18:08:11 -- Epoch: 2/20; Valid; loss: 0.603; acc: 0.691; precision: 0.698, recall: 0.671, macrof1: 0.690, weightedf1: 0.690\u001b[0m\n",
      "\u001b[92m2020-09-10 18:08:11\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=32.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:08:12\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_18:08:12 -- Epoch: 3/20; Train; loss: 0.531; acc: 0.723; precision: 0.717, recall: 0.736, macrof1: 0.723, weightedf1: 0.723\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:08:20\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_18:08:20 -- Epoch: 3/20; Valid; loss: 0.574; acc: 0.714; precision: 0.736, recall: 0.667, macrof1: 0.713, weightedf1: 0.713\u001b[0m\n",
      "\u001b[92m2020-09-10 18:08:20\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=32.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:08:21\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_18:08:21 -- Epoch: 4/20; Train; loss: 0.486; acc: 0.756; precision: 0.753, recall: 0.761, macrof1: 0.756, weightedf1: 0.756\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:08:29\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_18:08:29 -- Epoch: 4/20; Valid; loss: 0.554; acc: 0.733; precision: 0.719, recall: 0.766, macrof1: 0.733, weightedf1: 0.733\u001b[0m\n",
      "\u001b[92m2020-09-10 18:08:29\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=32.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:08:30\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_18:08:30 -- Epoch: 5/20; Train; loss: 0.442; acc: 0.785; precision: 0.777, recall: 0.799, macrof1: 0.785, weightedf1: 0.785\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:08:38\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_18:08:38 -- Epoch: 5/20; Valid; loss: 0.543; acc: 0.743; precision: 0.718, recall: 0.801, macrof1: 0.743, weightedf1: 0.743\u001b[0m\n",
      "\u001b[92m2020-09-10 18:08:38\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=32.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:08:39\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_18:08:39 -- Epoch: 6/20; Train; loss: 0.402; acc: 0.809; precision: 0.800, recall: 0.824, macrof1: 0.809, weightedf1: 0.809\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:08:46\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_18:08:46 -- Epoch: 6/20; Valid; loss: 0.545; acc: 0.747; precision: 0.714, recall: 0.826, macrof1: 0.746, weightedf1: 0.746\u001b[0m\n",
      "\u001b[92m2020-09-10 18:08:46\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model (early stopped) with least valid loss (checkpoint: 5) at ./models/wikigaz_en_ft_ocr_rnn_v001_n2000/wikigaz_en_ft_ocr_rnn_v001_n2000.model\u001b[0m\n",
      "\u001b[92m2020-09-10 18:08:46\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n",
      "\u001b[92m2020-09-10 18:08:46\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mEarly stopping at epoch: 6, selected epoch: 5\u001b[0m\n",
      "\n",
      "\n",
      "\n",
      "\n",
      "====================\n",
      "User time: 52.4146\n",
      "====================\n"
     ]
    }
   ],
   "source": [
    "from DeezyMatch import finetune as dm_finetune\n",
    "\n",
    "n_ft_examples = 2000  # number of OCR training examples used for fine-tuning\n",
    "\n",
    "# Fine-tune the pretrained WG:en baseline (pretrained_model_path, pretrained_vocab_path)\n",
    "# with the model A configuration: embedding and recurrent layers are frozen.\n",
    "dm_finetune(input_file_path=\"./inputs/input_dfm_rnn_model_A.yaml\",\n",
    "            dataset_path=\"./dataset/ocr_trainval.txt\",\n",
    "            model_name=f\"wikigaz_en_ft_ocr_rnn_v001_n{n_ft_examples}\",\n",
    "            pretrained_model_path=\"./models/wikigaz_en_rnn_001/wikigaz_en_rnn_001.model\",\n",
    "            pretrained_vocab_path=\"./models/wikigaz_en_rnn_001/wikigaz_en_rnn_001.vocab\",\n",
    "            n_train_examples=n_ft_examples\n",
    "           )"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 39,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:08:46\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mread input file: ./inputs/input_dfm_rnn_model_A.yaml\u001b[0m\n",
      "\u001b[92m2020-09-10 18:08:46\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32mpytorch will use: cuda:1\u001b[0m\n",
      "\u001b[92m2020-09-10 18:08:46\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mread CSV file: ./dataset/ocr_trainval.txt\u001b[0m\n",
      "\u001b[92m2020-09-10 18:08:47\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32mnumber of labels, True: 42301 and False: 42302\u001b[0m\n",
      "\u001b[92m2020-09-10 18:08:47\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mSplitting the Dataset\u001b[0m\n",
      "\u001b[92m2020-09-10 18:08:47\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mfinish splitting the Dataset. User time: 0.05385422706604004\u001b[0m\n",
      "\u001b[92m2020-09-10 18:08:47\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msplits are as follow:\n",
      "not_assigned    55221\n",
      "val             25380\n",
      "train            4000\n",
      "test                2\n",
      "Name: split, dtype: int64\u001b[0m\n",
      "\u001b[92m2020-09-10 18:08:47\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mstart creating a lookup table and convert characters to indices\u001b[0m\n",
      "\u001b[92m2020-09-10 18:08:47\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32m-- create vocabulary\u001b[0m\n",
      "\u001b[92m2020-09-10 18:08:48\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32m-- convert tokens to indices\u001b[0m\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:08:49\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mskipping 0 lines\u001b[0m\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "\n",
      "============================================================\n",
      "List all parameters in the model\n",
      "============================================================\n",
      "emb.weight False\n",
      "rnn_1.weight_ih_l0 False\n",
      "rnn_1.weight_hh_l0 False\n",
      "rnn_1.bias_ih_l0 False\n",
      "rnn_1.bias_hh_l0 False\n",
      "rnn_1.weight_ih_l0_reverse False\n",
      "rnn_1.weight_hh_l0_reverse False\n",
      "rnn_1.bias_ih_l0_reverse False\n",
      "rnn_1.bias_hh_l0_reverse False\n",
      "rnn_1.weight_ih_l1 False\n",
      "rnn_1.weight_hh_l1 False\n",
      "rnn_1.bias_ih_l1 False\n",
      "rnn_1.bias_hh_l1 False\n",
      "rnn_1.weight_ih_l1_reverse False\n",
      "rnn_1.weight_hh_l1_reverse False\n",
      "rnn_1.bias_ih_l1_reverse False\n",
      "rnn_1.bias_hh_l1_reverse False\n",
      "attn_step1.weight False\n",
      "attn_step1.bias False\n",
      "attn_step2.weight False\n",
      "attn_step2.bias False\n",
      "fc1.weight True\n",
      "fc1.bias True\n",
      "fc2.weight True\n",
      "fc2.bias True\n",
      "============================================================\n",
      "\n",
      "\n",
      "\n",
      "\u001b[92m2020-09-10 18:08:50\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m******************************\u001b[0m\n",
      "\u001b[92m2020-09-10 18:08:50\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m**** (Bi-directional) RNN ****\u001b[0m\n",
      "\u001b[92m2020-09-10 18:08:50\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m******************************\u001b[0m\n",
      "\u001b[92m2020-09-10 18:08:50\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mNumber of batches: 63\u001b[0m\n",
      "\u001b[92m2020-09-10 18:08:50\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mNumber of epochs: 20\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "8931a30d356f4c87b79b801cffc1fa14",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=20.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=63.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "\n",
      "\n",
      "====================\n",
      "Total number of params: 611883\n",
      "\n",
      "two_parallel_rnns (\n",
      "  (emb): Embedding(7542, 60), weights=((7542, 60),), parameters=452520\n",
      "  (rnn_1): RNN(60, 60, num_layers=2, dropout=0.01, bidirectional=True), weights=((60, 60), (60, 60), (60,), (60,), (60, 60), (60, 60), (60,), (60,), (60, 120), (60, 60), (60,), (60,), (60, 120), (60, 60), (60,), (60,)), parameters=36480\n",
      "  (attn_step1): Linear(in_features=120, out_features=60, bias=True), weights=((60, 120), (60,)), parameters=7260\n",
      "  (attn_step2): Linear(in_features=60, out_features=1, bias=True), weights=((1, 60), (1,)), parameters=61\n",
      "  (fc1): Linear(in_features=960, out_features=120, bias=True), weights=((120, 960), (120,)), parameters=115320\n",
      "  (fc2): Linear(in_features=120, out_features=2, bias=True), weights=((2, 120), (2,)), parameters=242\n",
      ")\n",
      "====================\n",
      "\n",
      "\n",
      "\u001b[92m2020-09-10 18:08:51\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_18:08:51 -- Epoch: 1/20; Train; loss: 0.742; acc: 0.588; precision: 0.582, recall: 0.629, macrof1: 0.588, weightedf1: 0.588\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:08:59\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_18:08:59 -- Epoch: 1/20; Valid; loss: 0.610; acc: 0.679; precision: 0.689, recall: 0.653, macrof1: 0.679, weightedf1: 0.679\u001b[0m\n",
      "\u001b[92m2020-09-10 18:08:59\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=63.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:09:01\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_18:09:01 -- Epoch: 2/20; Train; loss: 0.548; acc: 0.720; precision: 0.723, recall: 0.714, macrof1: 0.720, weightedf1: 0.720\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:09:09\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_18:09:09 -- Epoch: 2/20; Valid; loss: 0.551; acc: 0.729; precision: 0.744, recall: 0.699, macrof1: 0.729, weightedf1: 0.729\u001b[0m\n",
      "\u001b[92m2020-09-10 18:09:09\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=63.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:09:11\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_18:09:11 -- Epoch: 3/20; Train; loss: 0.490; acc: 0.764; precision: 0.760, recall: 0.771, macrof1: 0.764, weightedf1: 0.764\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:09:19\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_18:09:19 -- Epoch: 3/20; Valid; loss: 0.524; acc: 0.748; precision: 0.710, recall: 0.837, macrof1: 0.746, weightedf1: 0.746\u001b[0m\n",
      "\u001b[92m2020-09-10 18:09:19\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=63.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:09:21\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_18:09:21 -- Epoch: 4/20; Train; loss: 0.438; acc: 0.802; precision: 0.791, recall: 0.822, macrof1: 0.802, weightedf1: 0.802\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:09:29\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_18:09:29 -- Epoch: 4/20; Valid; loss: 0.496; acc: 0.769; precision: 0.775, recall: 0.758, macrof1: 0.769, weightedf1: 0.769\u001b[0m\n",
      "\u001b[92m2020-09-10 18:09:29\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=63.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:09:31\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_18:09:31 -- Epoch: 5/20; Train; loss: 0.389; acc: 0.826; precision: 0.810, recall: 0.851, macrof1: 0.826, weightedf1: 0.826\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:09:39\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_18:09:39 -- Epoch: 5/20; Valid; loss: 0.481; acc: 0.777; precision: 0.774, recall: 0.782, macrof1: 0.777, weightedf1: 0.777\u001b[0m\n",
      "\u001b[92m2020-09-10 18:09:39\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=63.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:09:41\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_18:09:41 -- Epoch: 6/20; Train; loss: 0.347; acc: 0.853; precision: 0.837, recall: 0.875, macrof1: 0.852, weightedf1: 0.852\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:09:49\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_18:09:49 -- Epoch: 6/20; Valid; loss: 0.483; acc: 0.781; precision: 0.796, recall: 0.754, macrof1: 0.781, weightedf1: 0.781\u001b[0m\n",
      "\u001b[92m2020-09-10 18:09:49\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model (early stopped) with least valid loss (checkpoint: 5) at ./models/wikigaz_en_ft_ocr_rnn_v001_n4000/wikigaz_en_ft_ocr_rnn_v001_n4000.model\u001b[0m\n",
      "\u001b[92m2020-09-10 18:09:49\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n",
      "\u001b[92m2020-09-10 18:09:49\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mEarly stopping at epoch: 6, selected epoch: 5\u001b[0m\n",
      "\n",
      "\n",
      "\n",
      "\n",
      "====================\n",
      "User time: 59.4692\n",
      "====================\n"
     ]
    }
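   ],
   "source": [
    "from DeezyMatch import finetune as dm_finetune\n",
    "\n",
    "n_ft_examples = 4000\n",
    "\n",
    "# Fine-tune the pretrained wikigaz_en RNN model (weights: pretrained_model_path, vocabulary: pretrained_vocab_path)\n",
    "# using n_ft_examples training examples from the OCR dataset; layer freezing is set in the model A input file\n",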
    "dm_finetune(input_file_path=\"./inputs/input_dfm_rnn_model_A.yaml\", \n",
    "            dataset_path=\"./dataset/ocr_trainval.txt\", \n",
    "            model_name=f\"wikigaz_en_ft_ocr_rnn_v001_n{n_ft_examples}\",\n",
    "            pretrained_model_path=\"./models/wikigaz_en_rnn_001/wikigaz_en_rnn_001.model\", \n",
    "            pretrained_vocab_path=\"./models/wikigaz_en_rnn_001/wikigaz_en_rnn_001.vocab\",\n",
    "            n_train_examples=n_ft_examples\n",
    "           )"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 40,
   "metadata": {
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:09:49\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mread input file: ./inputs/input_dfm_rnn_model_A.yaml\u001b[0m\n",
      "\u001b[92m2020-09-10 18:09:49\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32mpytorch will use: cuda:1\u001b[0m\n",
      "\u001b[92m2020-09-10 18:09:49\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mread CSV file: ./dataset/ocr_trainval.txt\u001b[0m\n",
      "\u001b[92m2020-09-10 18:09:50\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32mnumber of labels, True: 42301 and False: 42302\u001b[0m\n",
      "\u001b[92m2020-09-10 18:09:50\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mSplitting the Dataset\u001b[0m\n",
      "\u001b[92m2020-09-10 18:09:50\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mfinish splitting the Dataset. User time: 0.05468893051147461\u001b[0m\n",
      "\u001b[92m2020-09-10 18:09:50\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msplits are as follow:\n",
      "not_assigned    51221\n",
      "val             25380\n",
      "train            8000\n",
      "test                2\n",
      "Name: split, dtype: int64\u001b[0m\n",
      "\u001b[92m2020-09-10 18:09:50\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mstart creating a lookup table and convert characters to indices\u001b[0m\n",
      "\u001b[92m2020-09-10 18:09:50\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32m-- create vocabulary\u001b[0m\n",
      "\u001b[92m2020-09-10 18:09:51\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32m-- convert tokens to indices\u001b[0m\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "                                                    "
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:09:51\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mskipping 0 lines\u001b[0m\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "                                                                     \r"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "\n",
      "============================================================\n",
      "List all parameters in the model\n",
      "============================================================\n",
      "emb.weight False\n",
      "rnn_1.weight_ih_l0 False\n",
      "rnn_1.weight_hh_l0 False\n",
      "rnn_1.bias_ih_l0 False\n",
      "rnn_1.bias_hh_l0 False\n",
      "rnn_1.weight_ih_l0_reverse False\n",
      "rnn_1.weight_hh_l0_reverse False\n",
      "rnn_1.bias_ih_l0_reverse False\n",
      "rnn_1.bias_hh_l0_reverse False\n",
      "rnn_1.weight_ih_l1 False\n",
      "rnn_1.weight_hh_l1 False\n",
      "rnn_1.bias_ih_l1 False\n",
      "rnn_1.bias_hh_l1 False\n",
      "rnn_1.weight_ih_l1_reverse False\n",
      "rnn_1.weight_hh_l1_reverse False\n",
      "rnn_1.bias_ih_l1_reverse False\n",
      "rnn_1.bias_hh_l1_reverse False\n",
      "attn_step1.weight False\n",
      "attn_step1.bias False\n",
      "attn_step2.weight False\n",
      "attn_step2.bias False\n",
      "fc1.weight True\n",
      "fc1.bias True\n",
      "fc2.weight True\n",
      "fc2.bias True\n",
      "============================================================\n",
      "\n",
      "\n",
      "\n",
      "\u001b[92m2020-09-10 18:09:52\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m******************************\u001b[0m\n",
      "\u001b[92m2020-09-10 18:09:52\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m**** (Bi-directional) RNN ****\u001b[0m\n",
      "\u001b[92m2020-09-10 18:09:52\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m******************************\u001b[0m\n",
      "\u001b[92m2020-09-10 18:09:52\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mNumber of batches: 125\u001b[0m\n",
      "\u001b[92m2020-09-10 18:09:52\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mNumber of epochs: 20\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "609beed9335c4673b355280b1df34ecb",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=20.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=125.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "\n",
      "\n",
      "====================\n",
      "Total number of params: 611883\n",
      "\n",
      "two_parallel_rnns (\n",
      "  (emb): Embedding(7542, 60), weights=((7542, 60),), parameters=452520\n",
      "  (rnn_1): RNN(60, 60, num_layers=2, dropout=0.01, bidirectional=True), weights=((60, 60), (60, 60), (60,), (60,), (60, 60), (60, 60), (60,), (60,), (60, 120), (60, 60), (60,), (60,), (60, 120), (60, 60), (60,), (60,)), parameters=36480\n",
      "  (attn_step1): Linear(in_features=120, out_features=60, bias=True), weights=((60, 120), (60,)), parameters=7260\n",
      "  (attn_step2): Linear(in_features=60, out_features=1, bias=True), weights=((1, 60), (1,)), parameters=61\n",
      "  (fc1): Linear(in_features=960, out_features=120, bias=True), weights=((120, 960), (120,)), parameters=115320\n",
      "  (fc2): Linear(in_features=120, out_features=2, bias=True), weights=((2, 120), (2,)), parameters=242\n",
      ")\n",
      "====================\n",
      "\n",
      "\n",
      "\u001b[92m2020-09-10 18:09:56\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_18:09:56 -- Epoch: 1/20; Train; loss: 0.649; acc: 0.654; precision: 0.645, recall: 0.686, macrof1: 0.654, weightedf1: 0.654\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:10:04\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_18:10:04 -- Epoch: 1/20; Valid; loss: 0.534; acc: 0.741; precision: 0.725, recall: 0.774, macrof1: 0.740, weightedf1: 0.740\u001b[0m\n",
      "\u001b[92m2020-09-10 18:10:04\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=125.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:10:08\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_18:10:08 -- Epoch: 2/20; Train; loss: 0.487; acc: 0.769; precision: 0.756, recall: 0.795, macrof1: 0.769, weightedf1: 0.769\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:10:16\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_18:10:16 -- Epoch: 2/20; Valid; loss: 0.487; acc: 0.768; precision: 0.726, recall: 0.861, macrof1: 0.766, weightedf1: 0.766\u001b[0m\n",
      "\u001b[92m2020-09-10 18:10:16\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=125.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:10:19\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_18:10:19 -- Epoch: 3/20; Train; loss: 0.425; acc: 0.805; precision: 0.790, recall: 0.833, macrof1: 0.805, weightedf1: 0.805\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:10:27\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_18:10:27 -- Epoch: 3/20; Valid; loss: 0.444; acc: 0.797; precision: 0.778, recall: 0.831, macrof1: 0.797, weightedf1: 0.797\u001b[0m\n",
      "\u001b[92m2020-09-10 18:10:27\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=125.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:10:31\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_18:10:31 -- Epoch: 4/20; Train; loss: 0.372; acc: 0.835; precision: 0.816, recall: 0.864, macrof1: 0.835, weightedf1: 0.835\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:10:39\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_18:10:39 -- Epoch: 4/20; Valid; loss: 0.429; acc: 0.807; precision: 0.792, recall: 0.832, macrof1: 0.807, weightedf1: 0.807\u001b[0m\n",
      "\u001b[92m2020-09-10 18:10:39\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=125.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:10:43\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_18:10:43 -- Epoch: 5/20; Train; loss: 0.326; acc: 0.863; precision: 0.846, recall: 0.886, macrof1: 0.862, weightedf1: 0.862\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:10:51\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_18:10:51 -- Epoch: 5/20; Valid; loss: 0.426; acc: 0.812; precision: 0.795, recall: 0.841, macrof1: 0.812, weightedf1: 0.812\u001b[0m\n",
      "\u001b[92m2020-09-10 18:10:51\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=125.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:10:54\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_18:10:54 -- Epoch: 6/20; Train; loss: 0.291; acc: 0.879; precision: 0.864, recall: 0.898, macrof1: 0.879, weightedf1: 0.879\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:11:02\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_18:11:02 -- Epoch: 6/20; Valid; loss: 0.435; acc: 0.810; precision: 0.767, recall: 0.889, macrof1: 0.808, weightedf1: 0.808\u001b[0m\n",
      "\u001b[92m2020-09-10 18:11:02\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model (early stopped) with least valid loss (checkpoint: 5) at ./models/wikigaz_en_ft_ocr_rnn_v001_n8000/wikigaz_en_ft_ocr_rnn_v001_n8000.model\u001b[0m\n",
      "\u001b[92m2020-09-10 18:11:02\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n",
      "\u001b[92m2020-09-10 18:11:02\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mEarly stopping at epoch: 6, selected epoch: 5\u001b[0m\n",
      "\n",
      "\n",
      "\n",
      "\n",
      "====================\n",
      "User time: 70.3708\n",
      "====================\n"
     ]
    }
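   ],
   "source": [
    "from DeezyMatch import finetune as dm_finetune\n",
    "\n",
    "n_ft_examples = 8000\n",
    "\n",
    "# Fine-tune the pretrained wikigaz_en RNN model (weights: pretrained_model_path, vocabulary: pretrained_vocab_path)\n",
    "# using n_ft_examples training examples from the OCR dataset; layer freezing is set in the model A input file\n",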
    "dm_finetune(input_file_path=\"./inputs/input_dfm_rnn_model_A.yaml\", \n",
    "            dataset_path=\"./dataset/ocr_trainval.txt\", \n",
    "            model_name=f\"wikigaz_en_ft_ocr_rnn_v001_n{n_ft_examples}\",\n",
    "            pretrained_model_path=\"./models/wikigaz_en_rnn_001/wikigaz_en_rnn_001.model\", \n",
    "            pretrained_vocab_path=\"./models/wikigaz_en_rnn_001/wikigaz_en_rnn_001.vocab\",\n",
    "            n_train_examples=n_ft_examples\n",
    "           )"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 41,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:11:02\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mread input file: ./inputs/input_dfm_rnn_model_A.yaml\u001b[0m\n",
      "\u001b[92m2020-09-10 18:11:02\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32mpytorch will use: cuda:1\u001b[0m\n",
      "\u001b[92m2020-09-10 18:11:02\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mread CSV file: ./dataset/ocr_trainval.txt\u001b[0m\n",
      "\u001b[92m2020-09-10 18:11:03\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32mnumber of labels, True: 42301 and False: 42302\u001b[0m\n",
      "\u001b[92m2020-09-10 18:11:03\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mSplitting the Dataset\u001b[0m\n",
      "\u001b[92m2020-09-10 18:11:03\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mfinish splitting the Dataset. User time: 0.0556797981262207\u001b[0m\n",
      "\u001b[92m2020-09-10 18:11:03\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msplits are as follow:\n",
      "not_assigned    43221\n",
      "val             25380\n",
      "train           16000\n",
      "test                2\n",
      "Name: split, dtype: int64\u001b[0m\n",
      "\u001b[92m2020-09-10 18:11:03\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mstart creating a lookup table and convert characters to indices\u001b[0m\n",
      "\u001b[92m2020-09-10 18:11:03\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32m-- create vocabulary\u001b[0m\n",
      "\u001b[92m2020-09-10 18:11:04\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32m-- convert tokens to indices\u001b[0m\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "s1 padding:   0%|          | 0/16000 [00:00<?, ?it/s]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:11:05\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mskipping 0 lines\u001b[0m\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "                                                                     \r"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "\n",
      "============================================================\n",
      "List all parameters in the model\n",
      "============================================================\n",
      "emb.weight False\n",
      "rnn_1.weight_ih_l0 False\n",
      "rnn_1.weight_hh_l0 False\n",
      "rnn_1.bias_ih_l0 False\n",
      "rnn_1.bias_hh_l0 False\n",
      "rnn_1.weight_ih_l0_reverse False\n",
      "rnn_1.weight_hh_l0_reverse False\n",
      "rnn_1.bias_ih_l0_reverse False\n",
      "rnn_1.bias_hh_l0_reverse False\n",
      "rnn_1.weight_ih_l1 False\n",
      "rnn_1.weight_hh_l1 False\n",
      "rnn_1.bias_ih_l1 False\n",
      "rnn_1.bias_hh_l1 False\n",
      "rnn_1.weight_ih_l1_reverse False\n",
      "rnn_1.weight_hh_l1_reverse False\n",
      "rnn_1.bias_ih_l1_reverse False\n",
      "rnn_1.bias_hh_l1_reverse False\n",
      "attn_step1.weight False\n",
      "attn_step1.bias False\n",
      "attn_step2.weight False\n",
      "attn_step2.bias False\n",
      "fc1.weight True\n",
      "fc1.bias True\n",
      "fc2.weight True\n",
      "fc2.bias True\n",
      "============================================================\n",
      "\n",
      "\n",
      "\n",
      "\u001b[92m2020-09-10 18:11:06\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m******************************\u001b[0m\n",
      "\u001b[92m2020-09-10 18:11:06\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m**** (Bi-directional) RNN ****\u001b[0m\n",
      "\u001b[92m2020-09-10 18:11:06\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m******************************\u001b[0m\n",
      "\u001b[92m2020-09-10 18:11:06\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mNumber of batches: 250\u001b[0m\n",
      "\u001b[92m2020-09-10 18:11:06\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mNumber of epochs: 20\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "2c47ed6f96ab4799891b3ee415f4adc2",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=20.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=250.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "\n",
      "\n",
      "====================\n",
      "Total number of params: 611883\n",
      "\n",
      "two_parallel_rnns (\n",
      "  (emb): Embedding(7542, 60), weights=((7542, 60),), parameters=452520\n",
      "  (rnn_1): RNN(60, 60, num_layers=2, dropout=0.01, bidirectional=True), weights=((60, 60), (60, 60), (60,), (60,), (60, 60), (60, 60), (60,), (60,), (60, 120), (60, 60), (60,), (60,), (60, 120), (60, 60), (60,), (60,)), parameters=36480\n",
      "  (attn_step1): Linear(in_features=120, out_features=60, bias=True), weights=((60, 120), (60,)), parameters=7260\n",
      "  (attn_step2): Linear(in_features=60, out_features=1, bias=True), weights=((1, 60), (1,)), parameters=61\n",
      "  (fc1): Linear(in_features=960, out_features=120, bias=True), weights=((120, 960), (120,)), parameters=115320\n",
      "  (fc2): Linear(in_features=120, out_features=2, bias=True), weights=((2, 120), (2,)), parameters=242\n",
      ")\n",
      "====================\n",
      "\n",
      "\n",
      "\u001b[92m2020-09-10 18:11:13\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_18:11:13 -- Epoch: 1/20; Train; loss: 0.574; acc: 0.707; precision: 0.696, recall: 0.734, macrof1: 0.707, weightedf1: 0.707\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:11:21\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_18:11:21 -- Epoch: 1/20; Valid; loss: 0.479; acc: 0.777; precision: 0.760, recall: 0.807, macrof1: 0.776, weightedf1: 0.776\u001b[0m\n",
      "\u001b[92m2020-09-10 18:11:21\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=250.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:11:28\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_18:11:28 -- Epoch: 2/20; Train; loss: 0.428; acc: 0.803; precision: 0.790, recall: 0.827, macrof1: 0.803, weightedf1: 0.803\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:11:36\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_18:11:36 -- Epoch: 2/20; Valid; loss: 0.426; acc: 0.809; precision: 0.813, recall: 0.802, macrof1: 0.809, weightedf1: 0.809\u001b[0m\n",
      "\u001b[92m2020-09-10 18:11:36\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=250.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:11:43\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_18:11:43 -- Epoch: 3/20; Train; loss: 0.368; acc: 0.837; precision: 0.822, recall: 0.861, macrof1: 0.837, weightedf1: 0.837\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:11:51\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_18:11:51 -- Epoch: 3/20; Valid; loss: 0.397; acc: 0.825; precision: 0.800, recall: 0.867, macrof1: 0.825, weightedf1: 0.825\u001b[0m\n",
      "\u001b[92m2020-09-10 18:11:51\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=250.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:11:58\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_18:11:58 -- Epoch: 4/20; Train; loss: 0.323; acc: 0.863; precision: 0.849, recall: 0.884, macrof1: 0.863, weightedf1: 0.863\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:12:06\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_18:12:06 -- Epoch: 4/20; Valid; loss: 0.384; acc: 0.834; precision: 0.814, recall: 0.866, macrof1: 0.834, weightedf1: 0.834\u001b[0m\n",
      "\u001b[92m2020-09-10 18:12:07\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=250.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:12:14\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_18:12:14 -- Epoch: 5/20; Train; loss: 0.290; acc: 0.878; precision: 0.866, recall: 0.894, macrof1: 0.878, weightedf1: 0.878\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:12:22\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_18:12:22 -- Epoch: 5/20; Valid; loss: 0.384; acc: 0.834; precision: 0.799, recall: 0.894, macrof1: 0.834, weightedf1: 0.834\u001b[0m\n",
      "\u001b[92m2020-09-10 18:12:22\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=250.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:12:29\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_18:12:29 -- Epoch: 6/20; Train; loss: 0.259; acc: 0.893; precision: 0.879, recall: 0.912, macrof1: 0.893, weightedf1: 0.893\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:12:37\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_18:12:37 -- Epoch: 6/20; Valid; loss: 0.387; acc: 0.839; precision: 0.830, recall: 0.855, macrof1: 0.839, weightedf1: 0.839\u001b[0m\n",
      "\u001b[92m2020-09-10 18:12:37\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model (early stopped) with least valid loss (checkpoint: 5) at ./models/wikigaz_en_ft_ocr_rnn_v001_n16000/wikigaz_en_ft_ocr_rnn_v001_n16000.model\u001b[0m\n",
      "\u001b[92m2020-09-10 18:12:37\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n",
      "\u001b[92m2020-09-10 18:12:37\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mEarly stopping at epoch: 6, selected epoch: 5\u001b[0m\n",
      "\n",
      "\n",
      "\n",
      "\n",
      "====================\n",
      "User time: 91.5719\n",
      "====================\n"
     ]
    }
   ],
   "source": [
    "from DeezyMatch import finetune as dm_finetune\n",
    "\n",
    "n_ft_examples = 16000\n",
    "\n",
    "# fine-tune a pretrained model stored at pretrained_model_path and pretrained_vocab_path \n",
    "dm_finetune(input_file_path=\"./inputs/input_dfm_rnn_model_A.yaml\", \n",
    "            dataset_path=\"./dataset/ocr_trainval.txt\", \n",
    "            model_name=f\"wikigaz_en_ft_ocr_rnn_v001_n{n_ft_examples}\",\n",
    "            pretrained_model_path=\"./models/wikigaz_en_rnn_001/wikigaz_en_rnn_001.model\", \n",
    "            pretrained_vocab_path=\"./models/wikigaz_en_rnn_001/wikigaz_en_rnn_001.vocab\",\n",
    "            n_train_examples=n_ft_examples\n",
    "           )"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 42,
   "metadata": {
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:12:37\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mread input file: ./inputs/input_dfm_rnn_model_A.yaml\u001b[0m\n",
      "\u001b[92m2020-09-10 18:12:37\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32mpytorch will use: cuda:1\u001b[0m\n",
      "\u001b[92m2020-09-10 18:12:37\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mread CSV file: ./dataset/ocr_trainval.txt\u001b[0m\n",
      "\u001b[92m2020-09-10 18:12:38\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32mnumber of labels, True: 42301 and False: 42302\u001b[0m\n",
      "\u001b[92m2020-09-10 18:12:38\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mSplitting the Dataset\u001b[0m\n",
      "\u001b[92m2020-09-10 18:12:38\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mfinish splitting the Dataset. User time: 0.06780815124511719\u001b[0m\n",
      "\u001b[92m2020-09-10 18:12:38\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msplits are as follow:\n",
      "train           32000\n",
      "not_assigned    27221\n",
      "val             25380\n",
      "test                2\n",
      "Name: split, dtype: int64\u001b[0m\n",
      "\u001b[92m2020-09-10 18:12:38\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mstart creating a lookup table and convert characters to indices\u001b[0m\n",
      "\u001b[92m2020-09-10 18:12:38\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32m-- create vocabulary\u001b[0m\n",
      "\u001b[92m2020-09-10 18:12:39\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32m-- convert tokens to indices\u001b[0m\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "s1 padding:   0%|          | 0/32000 [00:00<?, ?it/s]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:12:40\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mskipping 0 lines\u001b[0m\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "                                                                     \r"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "\n",
      "============================================================\n",
      "List all parameters in the model\n",
      "============================================================\n",
      "emb.weight False\n",
      "rnn_1.weight_ih_l0 False\n",
      "rnn_1.weight_hh_l0 False\n",
      "rnn_1.bias_ih_l0 False\n",
      "rnn_1.bias_hh_l0 False\n",
      "rnn_1.weight_ih_l0_reverse False\n",
      "rnn_1.weight_hh_l0_reverse False\n",
      "rnn_1.bias_ih_l0_reverse False\n",
      "rnn_1.bias_hh_l0_reverse False\n",
      "rnn_1.weight_ih_l1 False\n",
      "rnn_1.weight_hh_l1 False\n",
      "rnn_1.bias_ih_l1 False\n",
      "rnn_1.bias_hh_l1 False\n",
      "rnn_1.weight_ih_l1_reverse False\n",
      "rnn_1.weight_hh_l1_reverse False\n",
      "rnn_1.bias_ih_l1_reverse False\n",
      "rnn_1.bias_hh_l1_reverse False\n",
      "attn_step1.weight False\n",
      "attn_step1.bias False\n",
      "attn_step2.weight False\n",
      "attn_step2.bias False\n",
      "fc1.weight True\n",
      "fc1.bias True\n",
      "fc2.weight True\n",
      "fc2.bias True\n",
      "============================================================\n",
      "\n",
      "\n",
      "\n",
      "\u001b[92m2020-09-10 18:12:41\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m******************************\u001b[0m\n",
      "\u001b[92m2020-09-10 18:12:41\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m**** (Bi-directional) RNN ****\u001b[0m\n",
      "\u001b[92m2020-09-10 18:12:41\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m******************************\u001b[0m\n",
      "\u001b[92m2020-09-10 18:12:41\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mNumber of batches: 500\u001b[0m\n",
      "\u001b[92m2020-09-10 18:12:41\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mNumber of epochs: 20\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "b9a1110cb92b43fa923bbf6f48188903",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=20.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=500.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "\n",
      "\n",
      "====================\n",
      "Total number of params: 611883\n",
      "\n",
      "two_parallel_rnns (\n",
      "  (emb): Embedding(7542, 60), weights=((7542, 60),), parameters=452520\n",
      "  (rnn_1): RNN(60, 60, num_layers=2, dropout=0.01, bidirectional=True), weights=((60, 60), (60, 60), (60,), (60,), (60, 60), (60, 60), (60,), (60,), (60, 120), (60, 60), (60,), (60,), (60, 120), (60, 60), (60,), (60,)), parameters=36480\n",
      "  (attn_step1): Linear(in_features=120, out_features=60, bias=True), weights=((60, 120), (60,)), parameters=7260\n",
      "  (attn_step2): Linear(in_features=60, out_features=1, bias=True), weights=((1, 60), (1,)), parameters=61\n",
      "  (fc1): Linear(in_features=960, out_features=120, bias=True), weights=((120, 960), (120,)), parameters=115320\n",
      "  (fc2): Linear(in_features=120, out_features=2, bias=True), weights=((2, 120), (2,)), parameters=242\n",
      ")\n",
      "====================\n",
      "\n",
      "\n",
      "\u001b[92m2020-09-10 18:12:55\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_18:12:55 -- Epoch: 1/20; Train; loss: 0.509; acc: 0.753; precision: 0.743, recall: 0.775, macrof1: 0.753, weightedf1: 0.753\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:13:03\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_18:13:03 -- Epoch: 1/20; Valid; loss: 0.406; acc: 0.819; precision: 0.803, recall: 0.846, macrof1: 0.819, weightedf1: 0.819\u001b[0m\n",
      "\u001b[92m2020-09-10 18:13:03\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=500.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:13:17\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_18:13:17 -- Epoch: 2/20; Train; loss: 0.372; acc: 0.838; precision: 0.821, recall: 0.863, macrof1: 0.838, weightedf1: 0.838\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:13:25\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_18:13:25 -- Epoch: 2/20; Valid; loss: 0.367; acc: 0.838; precision: 0.822, recall: 0.863, macrof1: 0.838, weightedf1: 0.838\u001b[0m\n",
      "\u001b[92m2020-09-10 18:13:25\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=500.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:13:40\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_18:13:40 -- Epoch: 3/20; Train; loss: 0.327; acc: 0.861; precision: 0.845, recall: 0.884, macrof1: 0.861, weightedf1: 0.861\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:13:48\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_18:13:48 -- Epoch: 3/20; Valid; loss: 0.352; acc: 0.849; precision: 0.833, recall: 0.872, macrof1: 0.849, weightedf1: 0.849\u001b[0m\n",
      "\u001b[92m2020-09-10 18:13:48\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=500.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:14:02\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_18:14:02 -- Epoch: 4/20; Train; loss: 0.291; acc: 0.877; precision: 0.861, recall: 0.898, macrof1: 0.876, weightedf1: 0.876\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:14:10\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_18:14:10 -- Epoch: 4/20; Valid; loss: 0.344; acc: 0.852; precision: 0.845, recall: 0.863, macrof1: 0.852, weightedf1: 0.852\u001b[0m\n",
      "\u001b[92m2020-09-10 18:14:10\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=500.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:14:23\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_18:14:23 -- Epoch: 5/20; Train; loss: 0.264; acc: 0.890; precision: 0.876, recall: 0.909, macrof1: 0.890, weightedf1: 0.890\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:14:31\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_18:14:31 -- Epoch: 5/20; Valid; loss: 0.350; acc: 0.852; precision: 0.824, recall: 0.895, macrof1: 0.852, weightedf1: 0.852\u001b[0m\n",
      "\u001b[92m2020-09-10 18:14:31\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model (early stopped) with least valid loss (checkpoint: 4) at ./models/wikigaz_en_ft_ocr_rnn_v001_n32000/wikigaz_en_ft_ocr_rnn_v001_n32000.model\u001b[0m\n",
      "\u001b[92m2020-09-10 18:14:31\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n",
      "\u001b[92m2020-09-10 18:14:31\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mEarly stopping at epoch: 5, selected epoch: 4\u001b[0m\n",
      "\n",
      "\n",
      "\n",
      "\n",
      "====================\n",
      "User time: 110.5102\n",
      "====================\n"
     ]
    }
   ],
   "source": [
    "from DeezyMatch import finetune as dm_finetune\n",
    "\n",
    "n_ft_examples = 32000\n",
    "\n",
    "# fine-tune a pretrained model stored at pretrained_model_path and pretrained_vocab_path \n",
    "dm_finetune(input_file_path=\"./inputs/input_dfm_rnn_model_A.yaml\", \n",
    "            dataset_path=\"./dataset/ocr_trainval.txt\", \n",
    "            model_name=f\"wikigaz_en_ft_ocr_rnn_v001_n{n_ft_examples}\",\n",
    "            pretrained_model_path=\"./models/wikigaz_en_rnn_001/wikigaz_en_rnn_001.model\", \n",
    "            pretrained_vocab_path=\"./models/wikigaz_en_rnn_001/wikigaz_en_rnn_001.vocab\",\n",
    "            n_train_examples=n_ft_examples\n",
    "           )"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 43,
   "metadata": {
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:14:31\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mread input file: ./inputs/input_dfm_rnn_model_A_no_early_stopping.yaml\u001b[0m\n",
      "\u001b[92m2020-09-10 18:14:31\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32mpytorch will use: cuda:1\u001b[0m\n",
      "\u001b[92m2020-09-10 18:14:31\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mread CSV file: ./dataset/ocr_trainval.txt\u001b[0m\n",
      "\u001b[92m2020-09-10 18:14:32\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32mnumber of labels, True: 42301 and False: 42302\u001b[0m\n",
      "\u001b[92m2020-09-10 18:14:32\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mSplitting the Dataset\u001b[0m\n",
      "\u001b[92m2020-09-10 18:14:32\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mfinish splitting the Dataset. User time: 0.05356478691101074\u001b[0m\n",
      "\u001b[92m2020-09-10 18:14:32\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msplits are as follow:\n",
      "train    64000\n",
      "val      20603\n",
      "Name: split, dtype: int64\u001b[0m\n",
      "\u001b[92m2020-09-10 18:14:32\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mstart creating a lookup table and convert characters to indices\u001b[0m\n",
      "\u001b[92m2020-09-10 18:14:32\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32m-- create vocabulary\u001b[0m\n",
      "\u001b[92m2020-09-10 18:14:33\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32m-- convert tokens to indices\u001b[0m\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "length s2:   0%|          | 0/64000 [00:00<?, ?it/s]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:14:34\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mskipping 0 lines\u001b[0m\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "                                                                     \r"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "\n",
      "============================================================\n",
      "List all parameters in the model\n",
      "============================================================\n",
      "emb.weight False\n",
      "rnn_1.weight_ih_l0 False\n",
      "rnn_1.weight_hh_l0 False\n",
      "rnn_1.bias_ih_l0 False\n",
      "rnn_1.bias_hh_l0 False\n",
      "rnn_1.weight_ih_l0_reverse False\n",
      "rnn_1.weight_hh_l0_reverse False\n",
      "rnn_1.bias_ih_l0_reverse False\n",
      "rnn_1.bias_hh_l0_reverse False\n",
      "rnn_1.weight_ih_l1 False\n",
      "rnn_1.weight_hh_l1 False\n",
      "rnn_1.bias_ih_l1 False\n",
      "rnn_1.bias_hh_l1 False\n",
      "rnn_1.weight_ih_l1_reverse False\n",
      "rnn_1.weight_hh_l1_reverse False\n",
      "rnn_1.bias_ih_l1_reverse False\n",
      "rnn_1.bias_hh_l1_reverse False\n",
      "attn_step1.weight False\n",
      "attn_step1.bias False\n",
      "attn_step2.weight False\n",
      "attn_step2.bias False\n",
      "fc1.weight True\n",
      "fc1.bias True\n",
      "fc2.weight True\n",
      "fc2.bias True\n",
      "============================================================\n",
      "\n",
      "\n",
      "\n",
      "\u001b[92m2020-09-10 18:14:35\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m******************************\u001b[0m\n",
      "\u001b[92m2020-09-10 18:14:35\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m**** (Bi-directional) RNN ****\u001b[0m\n",
      "\u001b[92m2020-09-10 18:14:35\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m******************************\u001b[0m\n",
      "\u001b[92m2020-09-10 18:14:35\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mNumber of batches: 1000\u001b[0m\n",
      "\u001b[92m2020-09-10 18:14:35\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mNumber of epochs: 10\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "af3760c4b8ce4f63825683bf0dfb6b38",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=10.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=1000.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "\n",
      "\n",
      "====================\n",
      "Total number of params: 611883\n",
      "\n",
      "two_parallel_rnns (\n",
      "  (emb): Embedding(7542, 60), weights=((7542, 60),), parameters=452520\n",
      "  (rnn_1): RNN(60, 60, num_layers=2, dropout=0.01, bidirectional=True), weights=((60, 60), (60, 60), (60,), (60,), (60, 60), (60, 60), (60,), (60,), (60, 120), (60, 60), (60,), (60,), (60, 120), (60, 60), (60,), (60,)), parameters=36480\n",
      "  (attn_step1): Linear(in_features=120, out_features=60, bias=True), weights=((60, 120), (60,)), parameters=7260\n",
      "  (attn_step2): Linear(in_features=60, out_features=1, bias=True), weights=((1, 60), (1,)), parameters=61\n",
      "  (fc1): Linear(in_features=960, out_features=120, bias=True), weights=((120, 960), (120,)), parameters=115320\n",
      "  (fc2): Linear(in_features=120, out_features=2, bias=True), weights=((2, 120), (2,)), parameters=242\n",
      ")\n",
      "====================\n",
      "\n",
      "\n",
      "\u001b[92m2020-09-10 18:15:02\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_18:15:02 -- Epoch: 1/10; Train; loss: 0.448; acc: 0.791; precision: 0.777, recall: 0.817, macrof1: 0.791, weightedf1: 0.791\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=322.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:15:09\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_18:15:09 -- Epoch: 1/10; Valid; loss: 0.386; acc: 0.829; precision: 0.862, recall: 0.783, macrof1: 0.828, weightedf1: 0.828\u001b[0m\n",
      "\u001b[92m2020-09-10 18:15:09\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=1000.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:15:37\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_18:15:37 -- Epoch: 2/10; Train; loss: 0.337; acc: 0.855; precision: 0.839, recall: 0.878, macrof1: 0.854, weightedf1: 0.854\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=322.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:15:43\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_18:15:43 -- Epoch: 2/10; Valid; loss: 0.346; acc: 0.849; precision: 0.809, recall: 0.915, macrof1: 0.849, weightedf1: 0.849\u001b[0m\n",
      "\u001b[92m2020-09-10 18:15:43\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=1000.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:16:10\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_18:16:10 -- Epoch: 3/10; Train; loss: 0.298; acc: 0.873; precision: 0.858, recall: 0.895, macrof1: 0.873, weightedf1: 0.873\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=322.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:16:17\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_18:16:17 -- Epoch: 3/10; Valid; loss: 0.323; acc: 0.862; precision: 0.850, recall: 0.880, macrof1: 0.862, weightedf1: 0.862\u001b[0m\n",
      "\u001b[92m2020-09-10 18:16:17\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=1000.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:16:45\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_18:16:45 -- Epoch: 4/10; Train; loss: 0.271; acc: 0.885; precision: 0.870, recall: 0.905, macrof1: 0.885, weightedf1: 0.885\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=322.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:16:51\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_18:16:51 -- Epoch: 4/10; Valid; loss: 0.324; acc: 0.861; precision: 0.836, recall: 0.898, macrof1: 0.860, weightedf1: 0.860\u001b[0m\n",
      "\u001b[92m2020-09-10 18:16:51\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=1000.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:17:20\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_18:17:20 -- Epoch: 5/10; Train; loss: 0.250; acc: 0.893; precision: 0.880, recall: 0.911, macrof1: 0.893, weightedf1: 0.893\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=322.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:17:27\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_18:17:27 -- Epoch: 5/10; Valid; loss: 0.323; acc: 0.864; precision: 0.829, recall: 0.918, macrof1: 0.864, weightedf1: 0.864\u001b[0m\n",
      "\u001b[92m2020-09-10 18:17:27\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=1000.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:17:56\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_18:17:56 -- Epoch: 6/10; Train; loss: 0.233; acc: 0.901; precision: 0.889, recall: 0.917, macrof1: 0.901, weightedf1: 0.901\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=322.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:18:04\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_18:18:04 -- Epoch: 6/10; Valid; loss: 0.319; acc: 0.867; precision: 0.865, recall: 0.869, macrof1: 0.867, weightedf1: 0.867\u001b[0m\n",
      "\u001b[92m2020-09-10 18:18:04\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=1000.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:18:33\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_18:18:33 -- Epoch: 7/10; Train; loss: 0.217; acc: 0.909; precision: 0.898, recall: 0.924, macrof1: 0.909, weightedf1: 0.909\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=322.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:18:39\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_18:18:39 -- Epoch: 7/10; Valid; loss: 0.337; acc: 0.863; precision: 0.832, recall: 0.910, macrof1: 0.863, weightedf1: 0.863\u001b[0m\n",
      "\u001b[92m2020-09-10 18:18:39\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=1000.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:19:11\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_18:19:11 -- Epoch: 8/10; Train; loss: 0.203; acc: 0.915; precision: 0.904, recall: 0.928, macrof1: 0.915, weightedf1: 0.915\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=322.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:19:22\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_18:19:22 -- Epoch: 8/10; Valid; loss: 0.336; acc: 0.867; precision: 0.833, recall: 0.917, macrof1: 0.867, weightedf1: 0.867\u001b[0m\n",
      "\u001b[92m2020-09-10 18:19:22\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=1000.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:20:00\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_18:20:00 -- Epoch: 9/10; Train; loss: 0.190; acc: 0.922; precision: 0.912, recall: 0.935, macrof1: 0.922, weightedf1: 0.922\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=322.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:20:08\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_18:20:08 -- Epoch: 9/10; Valid; loss: 0.339; acc: 0.869; precision: 0.857, recall: 0.886, macrof1: 0.869, weightedf1: 0.869\u001b[0m\n",
      "\u001b[92m2020-09-10 18:20:08\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=1000.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:20:37\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_18:20:37 -- Epoch: 10/10; Train; loss: 0.180; acc: 0.926; precision: 0.917, recall: 0.937, macrof1: 0.926, weightedf1: 0.926\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=322.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:20:43\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_18:20:43 -- Epoch: 10/10; Valid; loss: 0.359; acc: 0.861; precision: 0.879, recall: 0.837, macrof1: 0.861, weightedf1: 0.861\u001b[0m\n",
      "\u001b[92m2020-09-10 18:20:43\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n",
      "\n",
      "\u001b[92m2020-09-10 18:20:43\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model with least valid loss (checkpoint: 6) at ./models/wikigaz_en_ft_ocr_rnn_v001_n64000/wikigaz_en_ft_ocr_rnn_v001_n64000.model\u001b[0m\n",
      "\n",
      "\n",
      "\n",
      "====================\n",
      "User time: 368.7503\n",
      "====================\n"
     ]
    }
   ],
   "source": [
    "from DeezyMatch import finetune as dm_finetune\n",
    "\n",
    "n_ft_examples = 64000\n",
    "\n",
    "# Fine-tune the pretrained model (pretrained_model_path, pretrained_vocab_path) on n_ft_examples training pairs.\n",
    "# Model A: the embedding and recurrent layers stay frozen during fine-tuning.\n",
    "dm_finetune(input_file_path=\"./inputs/input_dfm_rnn_model_A_no_early_stopping.yaml\",\n",
    "            dataset_path=\"./dataset/ocr_trainval.txt\",\n",
    "            model_name=f\"wikigaz_en_ft_ocr_rnn_v001_n{n_ft_examples}\",\n",
    "            pretrained_model_path=\"./models/wikigaz_en_rnn_001/wikigaz_en_rnn_001.model\",\n",
    "            pretrained_vocab_path=\"./models/wikigaz_en_rnn_001/wikigaz_en_rnn_001.vocab\",\n",
    "            n_train_examples=n_ft_examples\n",
    "           )"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 44,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:20:44\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mread input file: ./inputs/input_dfm_rnn_model_A_no_early_stopping.yaml\u001b[0m\n",
      "\u001b[92m2020-09-10 18:20:44\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32mpytorch will use: cuda:1\u001b[0m\n",
      "\u001b[92m2020-09-10 18:20:44\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mread CSV file: ./dataset/ocr_trainval.txt\u001b[0m\n",
      "\u001b[92m2020-09-10 18:20:44\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32mnumber of labels, True: 42301 and False: 42302\u001b[0m\n",
      "\u001b[92m2020-09-10 18:20:44\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mSplitting the Dataset\u001b[0m\n",
      "\u001b[92m2020-09-10 18:20:44\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mfinish splitting the Dataset. User time: 0.05586600303649902\u001b[0m\n",
      "\u001b[92m2020-09-10 18:20:44\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msplits are as follow:\n",
      "train    84000\n",
      "val        603\n",
      "Name: split, dtype: int64\u001b[0m\n",
      "\u001b[92m2020-09-10 18:20:44\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mstart creating a lookup table and convert characters to indices\u001b[0m\n",
      "\u001b[92m2020-09-10 18:20:45\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32m-- create vocabulary\u001b[0m\n",
      "\u001b[92m2020-09-10 18:20:46\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32m-- convert tokens to indices\u001b[0m\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      "length s1:   0%|          | 0/84000 [00:00<?, ?it/s]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:20:46\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mskipping 0 lines\u001b[0m\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "                                                                     \r"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "\n",
      "============================================================\n",
      "List all parameters in the model\n",
      "============================================================\n",
      "emb.weight False\n",
      "rnn_1.weight_ih_l0 False\n",
      "rnn_1.weight_hh_l0 False\n",
      "rnn_1.bias_ih_l0 False\n",
      "rnn_1.bias_hh_l0 False\n",
      "rnn_1.weight_ih_l0_reverse False\n",
      "rnn_1.weight_hh_l0_reverse False\n",
      "rnn_1.bias_ih_l0_reverse False\n",
      "rnn_1.bias_hh_l0_reverse False\n",
      "rnn_1.weight_ih_l1 False\n",
      "rnn_1.weight_hh_l1 False\n",
      "rnn_1.bias_ih_l1 False\n",
      "rnn_1.bias_hh_l1 False\n",
      "rnn_1.weight_ih_l1_reverse False\n",
      "rnn_1.weight_hh_l1_reverse False\n",
      "rnn_1.bias_ih_l1_reverse False\n",
      "rnn_1.bias_hh_l1_reverse False\n",
      "attn_step1.weight False\n",
      "attn_step1.bias False\n",
      "attn_step2.weight False\n",
      "attn_step2.bias False\n",
      "fc1.weight True\n",
      "fc1.bias True\n",
      "fc2.weight True\n",
      "fc2.bias True\n",
      "============================================================\n",
      "\n",
      "\n",
      "\n",
      "\u001b[92m2020-09-10 18:20:47\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m******************************\u001b[0m\n",
      "\u001b[92m2020-09-10 18:20:47\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m**** (Bi-directional) RNN ****\u001b[0m\n",
      "\u001b[92m2020-09-10 18:20:47\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m******************************\u001b[0m\n",
      "\u001b[92m2020-09-10 18:20:47\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mNumber of batches: 1313\u001b[0m\n",
      "\u001b[92m2020-09-10 18:20:47\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mNumber of epochs: 10\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "e1ddb00aa9314e7bb77b670359377c40",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=10.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=1313.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "\n",
      "\n",
      "====================\n",
      "Total number of params: 611883\n",
      "\n",
      "two_parallel_rnns (\n",
      "  (emb): Embedding(7542, 60), weights=((7542, 60),), parameters=452520\n",
      "  (rnn_1): RNN(60, 60, num_layers=2, dropout=0.01, bidirectional=True), weights=((60, 60), (60, 60), (60,), (60,), (60, 60), (60, 60), (60,), (60,), (60, 120), (60, 60), (60,), (60,), (60, 120), (60, 60), (60,), (60,)), parameters=36480\n",
      "  (attn_step1): Linear(in_features=120, out_features=60, bias=True), weights=((60, 120), (60,)), parameters=7260\n",
      "  (attn_step2): Linear(in_features=60, out_features=1, bias=True), weights=((1, 60), (1,)), parameters=61\n",
      "  (fc1): Linear(in_features=960, out_features=120, bias=True), weights=((120, 960), (120,)), parameters=115320\n",
      "  (fc2): Linear(in_features=120, out_features=2, bias=True), weights=((2, 120), (2,)), parameters=242\n",
      ")\n",
      "====================\n",
      "\n",
      "\n",
      "\u001b[92m2020-09-10 18:21:43\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_18:21:43 -- Epoch: 1/10; Train; loss: 0.427; acc: 0.805; precision: 0.790, recall: 0.830, macrof1: 0.804, weightedf1: 0.804\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=10.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:21:43\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_18:21:43 -- Epoch: 1/10; Valid; loss: 0.365; acc: 0.836; precision: 0.824, recall: 0.854, macrof1: 0.836, weightedf1: 0.836\u001b[0m\n",
      "\u001b[92m2020-09-10 18:21:43\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=1313.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:22:21\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_18:22:21 -- Epoch: 2/10; Train; loss: 0.323; acc: 0.861; precision: 0.845, recall: 0.884, macrof1: 0.861, weightedf1: 0.861\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=10.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:22:21\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_18:22:21 -- Epoch: 2/10; Valid; loss: 0.321; acc: 0.864; precision: 0.829, recall: 0.917, macrof1: 0.864, weightedf1: 0.864\u001b[0m\n",
      "\u001b[92m2020-09-10 18:22:21\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=1313.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:23:01\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_18:23:01 -- Epoch: 3/10; Train; loss: 0.289; acc: 0.878; precision: 0.863, recall: 0.898, macrof1: 0.878, weightedf1: 0.878\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=10.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:23:01\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_18:23:01 -- Epoch: 3/10; Valid; loss: 0.325; acc: 0.864; precision: 0.833, recall: 0.910, macrof1: 0.864, weightedf1: 0.864\u001b[0m\n",
      "\u001b[92m2020-09-10 18:23:01\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=1313.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:23:57\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_18:23:57 -- Epoch: 4/10; Train; loss: 0.265; acc: 0.889; precision: 0.875, recall: 0.907, macrof1: 0.889, weightedf1: 0.889\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=10.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:23:58\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_18:23:58 -- Epoch: 4/10; Valid; loss: 0.312; acc: 0.876; precision: 0.865, recall: 0.890, macrof1: 0.876, weightedf1: 0.876\u001b[0m\n",
      "\u001b[92m2020-09-10 18:23:58\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=1313.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:24:36\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_18:24:36 -- Epoch: 5/10; Train; loss: 0.245; acc: 0.897; precision: 0.885, recall: 0.914, macrof1: 0.897, weightedf1: 0.897\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=10.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:24:36\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_18:24:36 -- Epoch: 5/10; Valid; loss: 0.318; acc: 0.872; precision: 0.852, recall: 0.900, macrof1: 0.872, weightedf1: 0.872\u001b[0m\n",
      "\u001b[92m2020-09-10 18:24:36\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=1313.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:25:16\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_18:25:16 -- Epoch: 6/10; Train; loss: 0.230; acc: 0.904; precision: 0.893, recall: 0.919, macrof1: 0.904, weightedf1: 0.904\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=10.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:25:16\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_18:25:16 -- Epoch: 6/10; Valid; loss: 0.324; acc: 0.871; precision: 0.845, recall: 0.907, macrof1: 0.870, weightedf1: 0.870\u001b[0m\n",
      "\u001b[92m2020-09-10 18:25:16\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=1313.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:25:59\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_18:25:59 -- Epoch: 7/10; Train; loss: 0.216; acc: 0.910; precision: 0.899, recall: 0.924, macrof1: 0.910, weightedf1: 0.910\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=10.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:26:00\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_18:26:00 -- Epoch: 7/10; Valid; loss: 0.329; acc: 0.869; precision: 0.849, recall: 0.897, macrof1: 0.869, weightedf1: 0.869\u001b[0m\n",
      "\u001b[92m2020-09-10 18:26:00\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=1313.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:26:52\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_18:26:52 -- Epoch: 8/10; Train; loss: 0.204; acc: 0.915; precision: 0.904, recall: 0.928, macrof1: 0.915, weightedf1: 0.915\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=10.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:26:52\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_18:26:52 -- Epoch: 8/10; Valid; loss: 0.382; acc: 0.859; precision: 0.827, recall: 0.907, macrof1: 0.859, weightedf1: 0.859\u001b[0m\n",
      "\u001b[92m2020-09-10 18:26:52\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=1313.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:27:32\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_18:27:32 -- Epoch: 9/10; Train; loss: 0.194; acc: 0.920; precision: 0.910, recall: 0.931, macrof1: 0.920, weightedf1: 0.920\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=10.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:27:32\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_18:27:32 -- Epoch: 9/10; Valid; loss: 0.337; acc: 0.876; precision: 0.862, recall: 0.894, macrof1: 0.876, weightedf1: 0.876\u001b[0m\n",
      "\u001b[92m2020-09-10 18:27:32\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=1313.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:28:29\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_18:28:29 -- Epoch: 10/10; Train; loss: 0.184; acc: 0.924; precision: 0.914, recall: 0.935, macrof1: 0.924, weightedf1: 0.924\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=10.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:28:29\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_18:28:29 -- Epoch: 10/10; Valid; loss: 0.381; acc: 0.867; precision: 0.860, recall: 0.877, macrof1: 0.867, weightedf1: 0.867\u001b[0m\n",
      "\u001b[92m2020-09-10 18:28:29\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n",
      "\n",
      "\u001b[92m2020-09-10 18:28:29\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model with least valid loss (checkpoint: 4) at ./models/wikigaz_en_ft_ocr_rnn_v001_n84000/wikigaz_en_ft_ocr_rnn_v001_n84000.model\u001b[0m\n",
      "\n",
      "\n",
      "\n",
      "====================\n",
      "User time: 461.5594\n",
      "====================\n"
     ]
    }
   ],
   "source": [
    "from DeezyMatch import finetune as dm_finetune\n",
    "\n",
    "n_ft_examples = 84000\n",
    "\n",
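    "# Fine-tune the pretrained WG:en RNN model (weights at pretrained_model_path, vocabulary at pretrained_vocab_path)\n",
    "# as model A without early stopping; n_train_examples sets how many OCR training pairs are used.\n",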
    "dm_finetune(input_file_path=\"./inputs/input_dfm_rnn_model_A_no_early_stopping.yaml\", \n",
    "            dataset_path=\"./dataset/ocr_trainval.txt\", \n",
    "            model_name=f\"wikigaz_en_ft_ocr_rnn_v001_n{n_ft_examples}\",\n",
    "            pretrained_model_path=\"./models/wikigaz_en_rnn_001/wikigaz_en_rnn_001.model\", \n",
    "            pretrained_vocab_path=\"./models/wikigaz_en_rnn_001/wikigaz_en_rnn_001.vocab\",\n",
    "            n_train_examples=n_ft_examples\n",
    "           )"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Fine-Tune, model B, GRU"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 45,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:28:29\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mread input file: ./inputs/input_dfm_gru_model_B.yaml\u001b[0m\n",
      "\u001b[92m2020-09-10 18:28:29\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32mpytorch will use: cuda:1\u001b[0m\n",
      "\u001b[92m2020-09-10 18:28:29\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mread CSV file: ./dataset/ocr_trainval.txt\u001b[0m\n",
      "\u001b[92m2020-09-10 18:28:29\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32mnumber of labels, True: 42301 and False: 42302\u001b[0m\n",
      "\u001b[92m2020-09-10 18:28:29\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mSplitting the Dataset\u001b[0m\n",
      "\u001b[92m2020-09-10 18:28:29\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mfinish splitting the Dataset. User time: 0.05349993705749512\u001b[0m\n",
      "\u001b[92m2020-09-10 18:28:29\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msplits are as follow:\n",
      "not_assigned    58971\n",
      "val             25380\n",
      "train             250\n",
      "test                2\n",
      "Name: split, dtype: int64\u001b[0m\n",
      "\u001b[92m2020-09-10 18:28:29\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mstart creating a lookup table and convert characters to indices\u001b[0m\n",
      "\u001b[92m2020-09-10 18:28:30\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32m-- create vocabulary\u001b[0m\n",
      "\u001b[92m2020-09-10 18:28:31\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32m-- convert tokens to indices\u001b[0m\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "length s2:   0%|          | 0/25380 [00:00<?, ?it/s]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:28:31\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mskipping 0 lines\u001b[0m\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "                                                                     \r"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "\n",
      "============================================================\n",
      "List all parameters in the model\n",
      "============================================================\n",
      "emb.weight False\n",
      "rnn_1.weight_ih_l0 True\n",
      "rnn_1.weight_hh_l0 True\n",
      "rnn_1.bias_ih_l0 True\n",
      "rnn_1.bias_hh_l0 True\n",
      "rnn_1.weight_ih_l0_reverse True\n",
      "rnn_1.weight_hh_l0_reverse True\n",
      "rnn_1.bias_ih_l0_reverse True\n",
      "rnn_1.bias_hh_l0_reverse True\n",
      "rnn_1.weight_ih_l1 True\n",
      "rnn_1.weight_hh_l1 True\n",
      "rnn_1.bias_ih_l1 True\n",
      "rnn_1.bias_hh_l1 True\n",
      "rnn_1.weight_ih_l1_reverse True\n",
      "rnn_1.weight_hh_l1_reverse True\n",
      "rnn_1.bias_ih_l1_reverse True\n",
      "rnn_1.bias_hh_l1_reverse True\n",
      "attn_step1.weight True\n",
      "attn_step1.bias True\n",
      "attn_step2.weight True\n",
      "attn_step2.bias True\n",
      "fc1.weight True\n",
      "fc1.bias True\n",
      "fc2.weight True\n",
      "fc2.bias True\n",
      "============================================================\n",
      "\n",
      "\n",
      "\n",
      "\u001b[92m2020-09-10 18:28:32\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m******************************\u001b[0m\n",
      "\u001b[92m2020-09-10 18:28:32\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m**** (Bi-directional) GRU ****\u001b[0m\n",
      "\u001b[92m2020-09-10 18:28:32\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m******************************\u001b[0m\n",
      "\u001b[92m2020-09-10 18:28:32\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mNumber of batches: 4\u001b[0m\n",
      "\u001b[92m2020-09-10 18:28:32\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mNumber of epochs: 20\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "00a0a2f48aee4793a8c31f3fdbddccb0",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=20.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=4.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "\n",
      "\n",
      "====================\n",
      "Total number of params: 684843\n",
      "\n",
      "two_parallel_rnns (\n",
      "  (emb): Embedding(7542, 60), weights=((7542, 60),), parameters=452520\n",
      "  (rnn_1): GRU(60, 60, num_layers=2, dropout=0.01, bidirectional=True), weights=((180, 60), (180, 60), (180,), (180,), (180, 60), (180, 60), (180,), (180,), (180, 120), (180, 60), (180,), (180,), (180, 120), (180, 60), (180,), (180,)), parameters=109440\n",
      "  (attn_step1): Linear(in_features=120, out_features=60, bias=True), weights=((60, 120), (60,)), parameters=7260\n",
      "  (attn_step2): Linear(in_features=60, out_features=1, bias=True), weights=((1, 60), (1,)), parameters=61\n",
      "  (fc1): Linear(in_features=960, out_features=120, bias=True), weights=((120, 960), (120,)), parameters=115320\n",
      "  (fc2): Linear(in_features=120, out_features=2, bias=True), weights=((2, 120), (2,)), parameters=242\n",
      ")\n",
      "====================\n",
      "\n",
      "\n",
      "\u001b[92m2020-09-10 18:28:32\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_18:28:32 -- Epoch: 1/20; Train; loss: 1.552; acc: 0.460; precision: 0.456, recall: 0.416, macrof1: 0.459, weightedf1: 0.459\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:28:40\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_18:28:40 -- Epoch: 1/20; Valid; loss: 1.395; acc: 0.483; precision: 0.481, recall: 0.436, macrof1: 0.482, weightedf1: 0.482\u001b[0m\n",
      "\u001b[92m2020-09-10 18:28:40\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=4.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:28:41\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_18:28:41 -- Epoch: 2/20; Train; loss: 0.882; acc: 0.612; precision: 0.637, recall: 0.520, macrof1: 0.609, weightedf1: 0.609\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:28:49\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_18:28:49 -- Epoch: 2/20; Valid; loss: 1.187; acc: 0.507; precision: 0.508, recall: 0.483, macrof1: 0.507, weightedf1: 0.507\u001b[0m\n",
      "\u001b[92m2020-09-10 18:28:49\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=4.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:28:49\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_18:28:49 -- Epoch: 3/20; Train; loss: 0.596; acc: 0.724; precision: 0.764, recall: 0.648, macrof1: 0.722, weightedf1: 0.722\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:28:57\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_18:28:57 -- Epoch: 3/20; Valid; loss: 1.060; acc: 0.532; precision: 0.530, recall: 0.558, macrof1: 0.532, weightedf1: 0.532\u001b[0m\n",
      "\u001b[92m2020-09-10 18:28:57\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=4.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:28:58\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_18:28:58 -- Epoch: 4/20; Train; loss: 0.449; acc: 0.808; precision: 0.824, recall: 0.784, macrof1: 0.808, weightedf1: 0.808\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:29:06\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_18:29:06 -- Epoch: 4/20; Valid; loss: 0.989; acc: 0.549; precision: 0.543, recall: 0.618, macrof1: 0.547, weightedf1: 0.547\u001b[0m\n",
      "\u001b[92m2020-09-10 18:29:06\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=4.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:29:06\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_18:29:06 -- Epoch: 5/20; Train; loss: 0.354; acc: 0.856; precision: 0.845, recall: 0.872, macrof1: 0.856, weightedf1: 0.856\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:29:15\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_18:29:15 -- Epoch: 5/20; Valid; loss: 0.952; acc: 0.564; precision: 0.554, recall: 0.654, macrof1: 0.560, weightedf1: 0.560\u001b[0m\n",
      "\u001b[92m2020-09-10 18:29:15\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=4.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:29:15\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_18:29:15 -- Epoch: 6/20; Train; loss: 0.292; acc: 0.912; precision: 0.893, recall: 0.936, macrof1: 0.912, weightedf1: 0.912\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:29:23\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_18:29:23 -- Epoch: 6/20; Valid; loss: 0.930; acc: 0.576; precision: 0.564, recall: 0.664, macrof1: 0.572, weightedf1: 0.572\u001b[0m\n",
      "\u001b[92m2020-09-10 18:29:23\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=4.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:29:23\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_18:29:23 -- Epoch: 7/20; Train; loss: 0.237; acc: 0.932; precision: 0.915, recall: 0.952, macrof1: 0.932, weightedf1: 0.932\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:29:32\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_18:29:32 -- Epoch: 7/20; Valid; loss: 0.921; acc: 0.587; precision: 0.575, recall: 0.668, macrof1: 0.584, weightedf1: 0.584\u001b[0m\n",
      "\u001b[92m2020-09-10 18:29:32\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=4.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:29:32\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_18:29:32 -- Epoch: 8/20; Train; loss: 0.195; acc: 0.948; precision: 0.918, recall: 0.984, macrof1: 0.948, weightedf1: 0.948\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:29:40\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_18:29:40 -- Epoch: 8/20; Valid; loss: 0.922; acc: 0.597; precision: 0.585, recall: 0.664, macrof1: 0.595, weightedf1: 0.595\u001b[0m\n",
      "\u001b[92m2020-09-10 18:29:40\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model (early stopped) with least valid loss (checkpoint: 7) at ./models/wikigaz_en_ft_ocr_gru_v002_n250/wikigaz_en_ft_ocr_gru_v002_n250.model\u001b[0m\n",
      "\u001b[92m2020-09-10 18:29:40\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n",
      "\u001b[92m2020-09-10 18:29:40\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mEarly stopping at epoch: 8, selected epoch: 7\u001b[0m\n",
      "\n",
      "\n",
      "\n",
      "\n",
      "====================\n",
      "User time: 68.0159\n",
      "====================\n"
     ]
    }
   ],
   "source": [
    "from DeezyMatch import finetune as dm_finetune\n",
    "\n",
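    "# Fine-tune the pretrained WG:en GRU model (weights at pretrained_model_path, vocabulary at pretrained_vocab_path)\n",
    "# as model B (only the embedding layer is frozen), using 250 OCR training pairs.\n",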
    "dm_finetune(input_file_path=\"./inputs/input_dfm_gru_model_B.yaml\", \n",
    "            dataset_path=\"./dataset/ocr_trainval.txt\", \n",
    "            model_name=\"wikigaz_en_ft_ocr_gru_v002_n250\",\n",
    "            pretrained_model_path=\"./models/wikigaz_en_gru_001/wikigaz_en_gru_001.model\", \n",
    "            pretrained_vocab_path=\"./models/wikigaz_en_gru_001/wikigaz_en_gru_001.vocab\",\n",
    "            n_train_examples=250\n",
    "           )"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 46,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:29:40\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mread input file: ./inputs/input_dfm_gru_model_B.yaml\u001b[0m\n",
      "\u001b[92m2020-09-10 18:29:40\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32mpytorch will use: cuda:1\u001b[0m\n",
      "\u001b[92m2020-09-10 18:29:40\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mread CSV file: ./dataset/ocr_trainval.txt\u001b[0m\n",
      "\u001b[92m2020-09-10 18:29:41\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32mnumber of labels, True: 42301 and False: 42302\u001b[0m\n",
      "\u001b[92m2020-09-10 18:29:41\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mSplitting the Dataset\u001b[0m\n",
      "\u001b[92m2020-09-10 18:29:41\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mfinish splitting the Dataset. User time: 0.048142194747924805\u001b[0m\n",
      "\u001b[92m2020-09-10 18:29:41\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msplits are as follow:\n",
      "not_assigned    58721\n",
      "val             25380\n",
      "train             500\n",
      "test                2\n",
      "Name: split, dtype: int64\u001b[0m\n",
      "\u001b[92m2020-09-10 18:29:41\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mstart creating a lookup table and convert characters to indices\u001b[0m\n",
      "\u001b[92m2020-09-10 18:29:41\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32m-- create vocabulary\u001b[0m\n",
      "\u001b[92m2020-09-10 18:29:42\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32m-- convert tokens to indices\u001b[0m\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:29:42\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mskipping 0 lines\u001b[0m\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "\n",
      "============================================================\n",
      "List all parameters in the model\n",
      "============================================================\n",
      "emb.weight False\n",
      "rnn_1.weight_ih_l0 True\n",
      "rnn_1.weight_hh_l0 True\n",
      "rnn_1.bias_ih_l0 True\n",
      "rnn_1.bias_hh_l0 True\n",
      "rnn_1.weight_ih_l0_reverse True\n",
      "rnn_1.weight_hh_l0_reverse True\n",
      "rnn_1.bias_ih_l0_reverse True\n",
      "rnn_1.bias_hh_l0_reverse True\n",
      "rnn_1.weight_ih_l1 True\n",
      "rnn_1.weight_hh_l1 True\n",
      "rnn_1.bias_ih_l1 True\n",
      "rnn_1.bias_hh_l1 True\n",
      "rnn_1.weight_ih_l1_reverse True\n",
      "rnn_1.weight_hh_l1_reverse True\n",
      "rnn_1.bias_ih_l1_reverse True\n",
      "rnn_1.bias_hh_l1_reverse True\n",
      "attn_step1.weight True\n",
      "attn_step1.bias True\n",
      "attn_step2.weight True\n",
      "attn_step2.bias True\n",
      "fc1.weight True\n",
      "fc1.bias True\n",
      "fc2.weight True\n",
      "fc2.bias True\n",
      "============================================================\n",
      "\n",
      "\n",
      "\n",
      "\u001b[92m2020-09-10 18:29:43\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m******************************\u001b[0m\n",
      "\u001b[92m2020-09-10 18:29:43\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m**** (Bi-directional) GRU ****\u001b[0m\n",
      "\u001b[92m2020-09-10 18:29:43\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m******************************\u001b[0m\n",
      "\u001b[92m2020-09-10 18:29:43\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mNumber of batches: 8\u001b[0m\n",
      "\u001b[92m2020-09-10 18:29:43\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mNumber of epochs: 20\u001b[0m\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "\n",
      "\n",
      "====================\n",
      "Total number of params: 684843\n",
      "\n",
      "two_parallel_rnns (\n",
      "  (emb): Embedding(7542, 60), weights=((7542, 60),), parameters=452520\n",
      "  (rnn_1): GRU(60, 60, num_layers=2, dropout=0.01, bidirectional=True), weights=((180, 60), (180, 60), (180,), (180,), (180, 60), (180, 60), (180,), (180,), (180, 120), (180, 60), (180,), (180,), (180, 120), (180, 60), (180,), (180,)), parameters=109440\n",
      "  (attn_step1): Linear(in_features=120, out_features=60, bias=True), weights=((60, 120), (60,)), parameters=7260\n",
      "  (attn_step2): Linear(in_features=60, out_features=1, bias=True), weights=((1, 60), (1,)), parameters=61\n",
      "  (fc1): Linear(in_features=960, out_features=120, bias=True), weights=((120, 960), (120,)), parameters=115320\n",
      "  (fc2): Linear(in_features=120, out_features=2, bias=True), weights=((2, 120), (2,)), parameters=242\n",
      ")\n",
      "====================\n",
      "\n",
      "\n",
      "\u001b[92m2020-09-10 18:29:43\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_18:29:43 -- Epoch: 1/20; Train; loss: 1.492; acc: 0.472; precision: 0.468, recall: 0.408, macrof1: 0.470, weightedf1: 0.470\u001b[0m\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:29:52\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_18:29:52 -- Epoch: 1/20; Valid; loss: 1.160; acc: 0.513; precision: 0.514, recall: 0.489, macrof1: 0.513, weightedf1: 0.513\u001b[0m\n",
      "\u001b[92m2020-09-10 18:29:52\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:29:52\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_18:29:52 -- Epoch: 2/20; Train; loss: 0.790; acc: 0.618; precision: 0.628, recall: 0.580, macrof1: 0.617, weightedf1: 0.617\u001b[0m\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:30:00\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_18:30:00 -- Epoch: 2/20; Valid; loss: 0.900; acc: 0.563; precision: 0.555, recall: 0.635, macrof1: 0.560, weightedf1: 0.560\u001b[0m\n",
      "\u001b[92m2020-09-10 18:30:00\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:30:01\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_18:30:01 -- Epoch: 3/20; Train; loss: 0.555; acc: 0.726; precision: 0.716, recall: 0.748, macrof1: 0.726, weightedf1: 0.726\u001b[0m\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:30:09\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_18:30:09 -- Epoch: 3/20; Valid; loss: 0.794; acc: 0.597; precision: 0.581, recall: 0.695, macrof1: 0.593, weightedf1: 0.593\u001b[0m\n",
      "\u001b[92m2020-09-10 18:30:09\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:30:09\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_18:30:09 -- Epoch: 4/20; Train; loss: 0.448; acc: 0.806; precision: 0.782, recall: 0.848, macrof1: 0.806, weightedf1: 0.806\u001b[0m\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:30:18\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_18:30:18 -- Epoch: 4/20; Valid; loss: 0.746; acc: 0.622; precision: 0.604, recall: 0.706, macrof1: 0.619, weightedf1: 0.619\u001b[0m\n",
      "\u001b[92m2020-09-10 18:30:18\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:30:18\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_18:30:18 -- Epoch: 5/20; Train; loss: 0.371; acc: 0.850; precision: 0.849, recall: 0.852, macrof1: 0.850, weightedf1: 0.850\u001b[0m\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:30:26\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_18:30:26 -- Epoch: 5/20; Valid; loss: 0.728; acc: 0.641; precision: 0.626, recall: 0.699, macrof1: 0.639, weightedf1: 0.639\u001b[0m\n",
      "\u001b[92m2020-09-10 18:30:26\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:30:27\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_18:30:27 -- Epoch: 6/20; Train; loss: 0.303; acc: 0.902; precision: 0.900, recall: 0.904, macrof1: 0.902, weightedf1: 0.902\u001b[0m\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:30:35\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_18:30:35 -- Epoch: 6/20; Valid; loss: 0.735; acc: 0.658; precision: 0.641, recall: 0.717, macrof1: 0.656, weightedf1: 0.656\u001b[0m\n",
      "\u001b[92m2020-09-10 18:30:35\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model (early stopped) with least valid loss (checkpoint: 5) at ./models/wikigaz_en_ft_ocr_gru_v002_n500/wikigaz_en_ft_ocr_gru_v002_n500.model\u001b[0m\n",
      "\u001b[92m2020-09-10 18:30:35\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n",
      "\u001b[92m2020-09-10 18:30:35\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mEarly stopping at epoch: 6, selected epoch: 5\u001b[0m\n",
      "\n",
      "\n",
      "\n",
      "\n",
      "====================\n",
      "User time: 52.2374\n",
      "====================\n"
     ]
    }
   ],
   "source": [
    "from DeezyMatch import finetune as dm_finetune\n",
    "\n",
    "# number of OCR training examples used for fine-tuning\n",
    "n_ft_examples = 500\n",
    "\n",
    "# fine-tune the pretrained model (pretrained_model_path, pretrained_vocab_path)\n",
    "# with the model B settings (embedding layer frozen)\n",
    "dm_finetune(input_file_path=\"./inputs/input_dfm_gru_model_B.yaml\",\n",
    "            dataset_path=\"./dataset/ocr_trainval.txt\",\n",
    "            model_name=f\"wikigaz_en_ft_ocr_gru_v002_n{n_ft_examples}\",\n",
    "            pretrained_model_path=\"./models/wikigaz_en_gru_001/wikigaz_en_gru_001.model\",\n",
    "            pretrained_vocab_path=\"./models/wikigaz_en_gru_001/wikigaz_en_gru_001.vocab\",\n",
    "            n_train_examples=n_ft_examples\n",
    "           )"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 47,
   "metadata": {
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:30:35\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mread input file: ./inputs/input_dfm_gru_model_B.yaml\u001b[0m\n",
      "\u001b[92m2020-09-10 18:30:35\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32mpytorch will use: cuda:1\u001b[0m\n",
      "\u001b[92m2020-09-10 18:30:35\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mread CSV file: ./dataset/ocr_trainval.txt\u001b[0m\n",
      "\u001b[92m2020-09-10 18:30:36\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32mnumber of labels, True: 42301 and False: 42302\u001b[0m\n",
      "\u001b[92m2020-09-10 18:30:36\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mSplitting the Dataset\u001b[0m\n",
      "\u001b[92m2020-09-10 18:30:36\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mfinish splitting the Dataset. User time: 0.05417013168334961\u001b[0m\n",
      "\u001b[92m2020-09-10 18:30:36\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msplits are as follow:\n",
      "not_assigned    58221\n",
      "val             25380\n",
      "train            1000\n",
      "test                2\n",
      "Name: split, dtype: int64\u001b[0m\n",
      "\u001b[92m2020-09-10 18:30:36\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mstart creating a lookup table and convert characters to indices\u001b[0m\n",
      "\u001b[92m2020-09-10 18:30:36\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32m-- create vocabulary\u001b[0m\n",
      "\u001b[92m2020-09-10 18:30:37\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32m-- convert tokens to indices\u001b[0m\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:30:38\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mskipping 0 lines\u001b[0m\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "\n",
      "============================================================\n",
      "List all parameters in the model\n",
      "============================================================\n",
      "emb.weight False\n",
      "rnn_1.weight_ih_l0 True\n",
      "rnn_1.weight_hh_l0 True\n",
      "rnn_1.bias_ih_l0 True\n",
      "rnn_1.bias_hh_l0 True\n",
      "rnn_1.weight_ih_l0_reverse True\n",
      "rnn_1.weight_hh_l0_reverse True\n",
      "rnn_1.bias_ih_l0_reverse True\n",
      "rnn_1.bias_hh_l0_reverse True\n",
      "rnn_1.weight_ih_l1 True\n",
      "rnn_1.weight_hh_l1 True\n",
      "rnn_1.bias_ih_l1 True\n",
      "rnn_1.bias_hh_l1 True\n",
      "rnn_1.weight_ih_l1_reverse True\n",
      "rnn_1.weight_hh_l1_reverse True\n",
      "rnn_1.bias_ih_l1_reverse True\n",
      "rnn_1.bias_hh_l1_reverse True\n",
      "attn_step1.weight True\n",
      "attn_step1.bias True\n",
      "attn_step2.weight True\n",
      "attn_step2.bias True\n",
      "fc1.weight True\n",
      "fc1.bias True\n",
      "fc2.weight True\n",
      "fc2.bias True\n",
      "============================================================\n",
      "\n",
      "\n",
      "\n",
      "\u001b[92m2020-09-10 18:30:38\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m******************************\u001b[0m\n",
      "\u001b[92m2020-09-10 18:30:38\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m**** (Bi-directional) GRU ****\u001b[0m\n",
      "\u001b[92m2020-09-10 18:30:38\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m******************************\u001b[0m\n",
      "\u001b[92m2020-09-10 18:30:38\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mNumber of batches: 16\u001b[0m\n",
      "\u001b[92m2020-09-10 18:30:38\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mNumber of epochs: 20\u001b[0m\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "\n",
      "\n",
      "====================\n",
      "Total number of params: 684843\n",
      "\n",
      "two_parallel_rnns (\n",
      "  (emb): Embedding(7542, 60), weights=((7542, 60),), parameters=452520\n",
      "  (rnn_1): GRU(60, 60, num_layers=2, dropout=0.01, bidirectional=True), weights=((180, 60), (180, 60), (180,), (180,), (180, 60), (180, 60), (180,), (180,), (180, 120), (180, 60), (180,), (180,), (180, 120), (180, 60), (180,), (180,)), parameters=109440\n",
      "  (attn_step1): Linear(in_features=120, out_features=60, bias=True), weights=((60, 120), (60,)), parameters=7260\n",
      "  (attn_step2): Linear(in_features=60, out_features=1, bias=True), weights=((1, 60), (1,)), parameters=61\n",
      "  (fc1): Linear(in_features=960, out_features=120, bias=True), weights=((120, 960), (120,)), parameters=115320\n",
      "  (fc2): Linear(in_features=120, out_features=2, bias=True), weights=((2, 120), (2,)), parameters=242\n",
      ")\n",
      "====================\n",
      "\n",
      "\n",
      "\u001b[92m2020-09-10 18:30:39\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_18:30:39 -- Epoch: 1/20; Train; loss: 1.237; acc: 0.504; precision: 0.504, recall: 0.482, macrof1: 0.504, weightedf1: 0.504\u001b[0m\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:30:47\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_18:30:47 -- Epoch: 1/20; Valid; loss: 0.864; acc: 0.561; precision: 0.559, recall: 0.579, macrof1: 0.561, weightedf1: 0.561\u001b[0m\n",
      "\u001b[92m2020-09-10 18:30:47\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:30:48\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_18:30:48 -- Epoch: 2/20; Train; loss: 0.626; acc: 0.660; precision: 0.652, recall: 0.688, macrof1: 0.660, weightedf1: 0.660\u001b[0m\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:30:57\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_18:30:57 -- Epoch: 2/20; Valid; loss: 0.690; acc: 0.624; precision: 0.609, recall: 0.694, macrof1: 0.622, weightedf1: 0.622\u001b[0m\n",
      "\u001b[92m2020-09-10 18:30:57\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:30:57\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_18:30:57 -- Epoch: 3/20; Train; loss: 0.491; acc: 0.765; precision: 0.738, recall: 0.822, macrof1: 0.764, weightedf1: 0.764\u001b[0m\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:31:06\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_18:31:06 -- Epoch: 3/20; Valid; loss: 0.643; acc: 0.658; precision: 0.642, recall: 0.714, macrof1: 0.657, weightedf1: 0.657\u001b[0m\n",
      "\u001b[92m2020-09-10 18:31:06\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:31:07\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_18:31:07 -- Epoch: 4/20; Train; loss: 0.398; acc: 0.830; precision: 0.829, recall: 0.832, macrof1: 0.830, weightedf1: 0.830\u001b[0m\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:31:19\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_18:31:19 -- Epoch: 4/20; Valid; loss: 0.621; acc: 0.690; precision: 0.688, recall: 0.696, macrof1: 0.690, weightedf1: 0.690\u001b[0m\n",
      "\u001b[92m2020-09-10 18:31:19\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:31:20\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_18:31:20 -- Epoch: 5/20; Train; loss: 0.315; acc: 0.881; precision: 0.883, recall: 0.878, macrof1: 0.881, weightedf1: 0.881\u001b[0m\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:31:31\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_18:31:31 -- Epoch: 5/20; Valid; loss: 0.607; acc: 0.716; precision: 0.705, recall: 0.741, macrof1: 0.715, weightedf1: 0.715\u001b[0m\n",
      "\u001b[92m2020-09-10 18:31:31\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:31:32\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_18:31:32 -- Epoch: 6/20; Train; loss: 0.234; acc: 0.925; precision: 0.928, recall: 0.922, macrof1: 0.925, weightedf1: 0.925\u001b[0m\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:31:43\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_18:31:43 -- Epoch: 6/20; Valid; loss: 0.598; acc: 0.738; precision: 0.726, recall: 0.764, macrof1: 0.738, weightedf1: 0.738\u001b[0m\n",
      "\u001b[92m2020-09-10 18:31:44\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:31:45\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_18:31:45 -- Epoch: 7/20; Train; loss: 0.161; acc: 0.964; precision: 0.958, recall: 0.970, macrof1: 0.964, weightedf1: 0.964\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:31:54\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_18:31:54 -- Epoch: 7/20; Valid; loss: 0.598; acc: 0.748; precision: 0.750, recall: 0.742, macrof1: 0.748, weightedf1: 0.748\u001b[0m\n",
      "\u001b[92m2020-09-10 18:31:54\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model (early stopped) with least valid loss (checkpoint: 6) at ./models/wikigaz_en_ft_ocr_gru_v002_n1000/wikigaz_en_ft_ocr_gru_v002_n1000.model\u001b[0m\n",
      "\u001b[92m2020-09-10 18:31:54\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n",
      "\u001b[92m2020-09-10 18:31:54\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mEarly stopping at epoch: 7, selected epoch: 6\u001b[0m\n",
      "\n",
      "\n",
      "\n",
      "\n",
      "====================\n",
      "User time: 75.6827\n",
      "====================\n"
     ]
    }
   ],
   "source": [
    "from DeezyMatch import finetune as dm_finetune\n",
    "\n",
    "n_ft_examples = 1000\n",
    "\n",
    "# Fine-tune the pretrained model (pretrained_model_path, pretrained_vocab_path)\n",
    "# on n_ft_examples training examples, using the model B input file.\n",
    "dm_finetune(input_file_path=\"./inputs/input_dfm_gru_model_B.yaml\",\n",
    "            dataset_path=\"./dataset/ocr_trainval.txt\",\n",
    "            model_name=f\"wikigaz_en_ft_ocr_gru_v002_n{n_ft_examples}\",\n",
    "            pretrained_model_path=\"./models/wikigaz_en_gru_001/wikigaz_en_gru_001.model\",\n",
    "            pretrained_vocab_path=\"./models/wikigaz_en_gru_001/wikigaz_en_gru_001.vocab\",\n",
    "            n_train_examples=n_ft_examples\n",
    "           )"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 48,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:31:54\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mread input file: ./inputs/input_dfm_gru_model_B.yaml\u001b[0m\n",
      "\u001b[92m2020-09-10 18:31:54\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32mpytorch will use: cuda:1\u001b[0m\n",
      "\u001b[92m2020-09-10 18:31:54\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mread CSV file: ./dataset/ocr_trainval.txt\u001b[0m\n",
      "\u001b[92m2020-09-10 18:31:55\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32mnumber of labels, True: 42301 and False: 42302\u001b[0m\n",
      "\u001b[92m2020-09-10 18:31:55\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mSplitting the Dataset\u001b[0m\n",
      "\u001b[92m2020-09-10 18:31:55\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mfinish splitting the Dataset. User time: 0.05487489700317383\u001b[0m\n",
      "\u001b[92m2020-09-10 18:31:55\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msplits are as follow:\n",
      "not_assigned    57221\n",
      "val             25380\n",
      "train            2000\n",
      "test                2\n",
      "Name: split, dtype: int64\u001b[0m\n",
      "\u001b[92m2020-09-10 18:31:55\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mstart creating a lookup table and convert characters to indices\u001b[0m\n",
      "\u001b[92m2020-09-10 18:31:55\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32m-- create vocabulary\u001b[0m\n",
      "\u001b[92m2020-09-10 18:31:56\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32m-- convert tokens to indices\u001b[0m\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "length s2:   0%|          | 0/25380 [00:00<?, ?it/s]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:31:56\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mskipping 0 lines\u001b[0m\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "                                                                     \r"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "\n",
      "============================================================\n",
      "List all parameters in the model\n",
      "============================================================\n",
      "emb.weight False\n",
      "rnn_1.weight_ih_l0 True\n",
      "rnn_1.weight_hh_l0 True\n",
      "rnn_1.bias_ih_l0 True\n",
      "rnn_1.bias_hh_l0 True\n",
      "rnn_1.weight_ih_l0_reverse True\n",
      "rnn_1.weight_hh_l0_reverse True\n",
      "rnn_1.bias_ih_l0_reverse True\n",
      "rnn_1.bias_hh_l0_reverse True\n",
      "rnn_1.weight_ih_l1 True\n",
      "rnn_1.weight_hh_l1 True\n",
      "rnn_1.bias_ih_l1 True\n",
      "rnn_1.bias_hh_l1 True\n",
      "rnn_1.weight_ih_l1_reverse True\n",
      "rnn_1.weight_hh_l1_reverse True\n",
      "rnn_1.bias_ih_l1_reverse True\n",
      "rnn_1.bias_hh_l1_reverse True\n",
      "attn_step1.weight True\n",
      "attn_step1.bias True\n",
      "attn_step2.weight True\n",
      "attn_step2.bias True\n",
      "fc1.weight True\n",
      "fc1.bias True\n",
      "fc2.weight True\n",
      "fc2.bias True\n",
      "============================================================\n",
      "\n",
      "\n",
      "\n",
      "\u001b[92m2020-09-10 18:31:57\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m******************************\u001b[0m\n",
      "\u001b[92m2020-09-10 18:31:57\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m**** (Bi-directional) GRU ****\u001b[0m\n",
      "\u001b[92m2020-09-10 18:31:57\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m******************************\u001b[0m\n",
      "\u001b[92m2020-09-10 18:31:57\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mNumber of batches: 32\u001b[0m\n",
      "\u001b[92m2020-09-10 18:31:57\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mNumber of epochs: 20\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "629702d41d164fdeb5c6a5ab8afb8e4c",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=20.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=32.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "\n",
      "\n",
      "====================\n",
      "Total number of params: 684843\n",
      "\n",
      "two_parallel_rnns (\n",
      "  (emb): Embedding(7542, 60), weights=((7542, 60),), parameters=452520\n",
      "  (rnn_1): GRU(60, 60, num_layers=2, dropout=0.01, bidirectional=True), weights=((180, 60), (180, 60), (180,), (180,), (180, 60), (180, 60), (180,), (180,), (180, 120), (180, 60), (180,), (180,), (180, 120), (180, 60), (180,), (180,)), parameters=109440\n",
      "  (attn_step1): Linear(in_features=120, out_features=60, bias=True), weights=((60, 120), (60,)), parameters=7260\n",
      "  (attn_step2): Linear(in_features=60, out_features=1, bias=True), weights=((1, 60), (1,)), parameters=61\n",
      "  (fc1): Linear(in_features=960, out_features=120, bias=True), weights=((120, 960), (120,)), parameters=115320\n",
      "  (fc2): Linear(in_features=120, out_features=2, bias=True), weights=((2, 120), (2,)), parameters=242\n",
      ")\n",
      "====================\n",
      "\n",
      "\n",
      "\u001b[92m2020-09-10 18:31:59\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_18:31:59 -- Epoch: 1/20; Train; loss: 1.002; acc: 0.548; precision: 0.548, recall: 0.551, macrof1: 0.548, weightedf1: 0.548\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:32:07\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_18:32:07 -- Epoch: 1/20; Valid; loss: 0.675; acc: 0.628; precision: 0.603, recall: 0.755, macrof1: 0.622, weightedf1: 0.622\u001b[0m\n",
      "\u001b[92m2020-09-10 18:32:07\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=32.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:32:09\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_18:32:09 -- Epoch: 2/20; Train; loss: 0.556; acc: 0.708; precision: 0.686, recall: 0.765, macrof1: 0.707, weightedf1: 0.707\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:32:17\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_18:32:17 -- Epoch: 2/20; Valid; loss: 0.584; acc: 0.695; precision: 0.679, recall: 0.743, macrof1: 0.695, weightedf1: 0.695\u001b[0m\n",
      "\u001b[92m2020-09-10 18:32:17\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=32.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:32:19\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_18:32:19 -- Epoch: 3/20; Train; loss: 0.437; acc: 0.805; precision: 0.798, recall: 0.816, macrof1: 0.805, weightedf1: 0.805\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:32:27\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_18:32:27 -- Epoch: 3/20; Valid; loss: 0.540; acc: 0.744; precision: 0.743, recall: 0.746, macrof1: 0.744, weightedf1: 0.744\u001b[0m\n",
      "\u001b[92m2020-09-10 18:32:27\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=32.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:32:29\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_18:32:29 -- Epoch: 4/20; Train; loss: 0.323; acc: 0.871; precision: 0.875, recall: 0.864, macrof1: 0.870, weightedf1: 0.870\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:32:37\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_18:32:37 -- Epoch: 4/20; Valid; loss: 0.506; acc: 0.777; precision: 0.779, recall: 0.773, macrof1: 0.777, weightedf1: 0.777\u001b[0m\n",
      "\u001b[92m2020-09-10 18:32:37\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=32.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:32:39\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_18:32:39 -- Epoch: 5/20; Train; loss: 0.221; acc: 0.922; precision: 0.931, recall: 0.911, macrof1: 0.922, weightedf1: 0.922\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:32:47\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_18:32:47 -- Epoch: 5/20; Valid; loss: 0.491; acc: 0.796; precision: 0.794, recall: 0.800, macrof1: 0.796, weightedf1: 0.796\u001b[0m\n",
      "\u001b[92m2020-09-10 18:32:47\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=32.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:32:49\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_18:32:49 -- Epoch: 6/20; Train; loss: 0.140; acc: 0.962; precision: 0.967, recall: 0.957, macrof1: 0.962, weightedf1: 0.962\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:32:57\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_18:32:57 -- Epoch: 6/20; Valid; loss: 0.503; acc: 0.809; precision: 0.801, recall: 0.822, macrof1: 0.809, weightedf1: 0.809\u001b[0m\n",
      "\u001b[92m2020-09-10 18:32:57\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model (early stopped) with least valid loss (checkpoint: 5) at ./models/wikigaz_en_ft_ocr_gru_v002_n2000/wikigaz_en_ft_ocr_gru_v002_n2000.model\u001b[0m\n",
      "\u001b[92m2020-09-10 18:32:57\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n",
      "\u001b[92m2020-09-10 18:32:57\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mEarly stopping at epoch: 6, selected epoch: 5\u001b[0m\n",
      "\n",
      "\n",
      "\n",
      "\n",
      "====================\n",
      "User time: 60.4684\n",
      "====================\n"
     ]
    }
   ],
   "source": [
    "from DeezyMatch import finetune as dm_finetune\n",
    "\n",
    "n_ft_examples = 2000\n",
    "\n",
    "# Fine-tune the pretrained model (pretrained_model_path, pretrained_vocab_path)\n",
    "# on n_ft_examples training examples, using the model B input file.\n",
    "dm_finetune(input_file_path=\"./inputs/input_dfm_gru_model_B.yaml\",\n",
    "            dataset_path=\"./dataset/ocr_trainval.txt\",\n",
    "            model_name=f\"wikigaz_en_ft_ocr_gru_v002_n{n_ft_examples}\",\n",
    "            pretrained_model_path=\"./models/wikigaz_en_gru_001/wikigaz_en_gru_001.model\",\n",
    "            pretrained_vocab_path=\"./models/wikigaz_en_gru_001/wikigaz_en_gru_001.vocab\",\n",
    "            n_train_examples=n_ft_examples\n",
    "           )"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 49,
   "metadata": {
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:32:57\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mread input file: ./inputs/input_dfm_gru_model_B.yaml\u001b[0m\n",
      "\u001b[92m2020-09-10 18:32:58\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32mpytorch will use: cuda:1\u001b[0m\n",
      "\u001b[92m2020-09-10 18:32:58\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mread CSV file: ./dataset/ocr_trainval.txt\u001b[0m\n",
      "\u001b[92m2020-09-10 18:32:58\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32mnumber of labels, True: 42301 and False: 42302\u001b[0m\n",
      "\u001b[92m2020-09-10 18:32:58\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mSplitting the Dataset\u001b[0m\n",
      "\u001b[92m2020-09-10 18:32:58\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mfinish splitting the Dataset. User time: 0.05280113220214844\u001b[0m\n",
      "\u001b[92m2020-09-10 18:32:58\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msplits are as follow:\n",
      "not_assigned    55221\n",
      "val             25380\n",
      "train            4000\n",
      "test                2\n",
      "Name: split, dtype: int64\u001b[0m\n",
      "\u001b[92m2020-09-10 18:32:58\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mstart creating a lookup table and convert characters to indices\u001b[0m\n",
      "\u001b[92m2020-09-10 18:32:58\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32m-- create vocabulary\u001b[0m\n",
      "\u001b[92m2020-09-10 18:32:59\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32m-- convert tokens to indices\u001b[0m\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "length s1:   0%|          | 0/25380 [00:00<?, ?it/s]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:33:00\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mskipping 0 lines\u001b[0m\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "                                                                     \r"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "\n",
      "============================================================\n",
      "List all parameters in the model\n",
      "============================================================\n",
      "emb.weight False\n",
      "rnn_1.weight_ih_l0 True\n",
      "rnn_1.weight_hh_l0 True\n",
      "rnn_1.bias_ih_l0 True\n",
      "rnn_1.bias_hh_l0 True\n",
      "rnn_1.weight_ih_l0_reverse True\n",
      "rnn_1.weight_hh_l0_reverse True\n",
      "rnn_1.bias_ih_l0_reverse True\n",
      "rnn_1.bias_hh_l0_reverse True\n",
      "rnn_1.weight_ih_l1 True\n",
      "rnn_1.weight_hh_l1 True\n",
      "rnn_1.bias_ih_l1 True\n",
      "rnn_1.bias_hh_l1 True\n",
      "rnn_1.weight_ih_l1_reverse True\n",
      "rnn_1.weight_hh_l1_reverse True\n",
      "rnn_1.bias_ih_l1_reverse True\n",
      "rnn_1.bias_hh_l1_reverse True\n",
      "attn_step1.weight True\n",
      "attn_step1.bias True\n",
      "attn_step2.weight True\n",
      "attn_step2.bias True\n",
      "fc1.weight True\n",
      "fc1.bias True\n",
      "fc2.weight True\n",
      "fc2.bias True\n",
      "============================================================\n",
      "\n",
      "\n",
      "\n",
      "\u001b[92m2020-09-10 18:33:01\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m******************************\u001b[0m\n",
      "\u001b[92m2020-09-10 18:33:01\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m**** (Bi-directional) GRU ****\u001b[0m\n",
      "\u001b[92m2020-09-10 18:33:01\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m******************************\u001b[0m\n",
      "\u001b[92m2020-09-10 18:33:01\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mNumber of batches: 63\u001b[0m\n",
      "\u001b[92m2020-09-10 18:33:01\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mNumber of epochs: 20\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "6c1934384a6c4d83ac16e4d3f0298b38",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=20.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=63.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "\n",
      "\n",
      "====================\n",
      "Total number of params: 684843\n",
      "\n",
      "two_parallel_rnns (\n",
      "  (emb): Embedding(7542, 60), weights=((7542, 60),), parameters=452520\n",
      "  (rnn_1): GRU(60, 60, num_layers=2, dropout=0.01, bidirectional=True), weights=((180, 60), (180, 60), (180,), (180,), (180, 60), (180, 60), (180,), (180,), (180, 120), (180, 60), (180,), (180,), (180, 120), (180, 60), (180,), (180,)), parameters=109440\n",
      "  (attn_step1): Linear(in_features=120, out_features=60, bias=True), weights=((60, 120), (60,)), parameters=7260\n",
      "  (attn_step2): Linear(in_features=60, out_features=1, bias=True), weights=((1, 60), (1,)), parameters=61\n",
      "  (fc1): Linear(in_features=960, out_features=120, bias=True), weights=((120, 960), (120,)), parameters=115320\n",
      "  (fc2): Linear(in_features=120, out_features=2, bias=True), weights=((2, 120), (2,)), parameters=242\n",
      ")\n",
      "====================\n",
      "\n",
      "\n",
      "\u001b[92m2020-09-10 18:33:04\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_18:33:04 -- Epoch: 1/20; Train; loss: 0.804; acc: 0.617; precision: 0.610, recall: 0.649, macrof1: 0.616, weightedf1: 0.616\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:33:12\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_18:33:12 -- Epoch: 1/20; Valid; loss: 0.582; acc: 0.692; precision: 0.670, recall: 0.757, macrof1: 0.691, weightedf1: 0.691\u001b[0m\n",
      "\u001b[92m2020-09-10 18:33:13\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=63.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:33:16\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_18:33:16 -- Epoch: 2/20; Train; loss: 0.472; acc: 0.770; precision: 0.757, recall: 0.794, macrof1: 0.770, weightedf1: 0.770\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:33:24\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_18:33:24 -- Epoch: 2/20; Valid; loss: 0.484; acc: 0.775; precision: 0.812, recall: 0.717, macrof1: 0.775, weightedf1: 0.775\u001b[0m\n",
      "\u001b[92m2020-09-10 18:33:24\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=63.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:33:28\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_18:33:28 -- Epoch: 3/20; Train; loss: 0.325; acc: 0.868; precision: 0.876, recall: 0.858, macrof1: 0.868, weightedf1: 0.868\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:33:38\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_18:33:38 -- Epoch: 3/20; Valid; loss: 0.413; acc: 0.824; precision: 0.819, recall: 0.831, macrof1: 0.824, weightedf1: 0.824\u001b[0m\n",
      "\u001b[92m2020-09-10 18:33:38\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=63.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:33:44\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_18:33:44 -- Epoch: 4/20; Train; loss: 0.208; acc: 0.930; precision: 0.928, recall: 0.931, macrof1: 0.930, weightedf1: 0.930\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:33:57\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_18:33:57 -- Epoch: 4/20; Valid; loss: 0.391; acc: 0.840; precision: 0.852, recall: 0.822, macrof1: 0.840, weightedf1: 0.840\u001b[0m\n",
      "\u001b[92m2020-09-10 18:33:57\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=63.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:34:03\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_18:34:03 -- Epoch: 5/20; Train; loss: 0.125; acc: 0.966; precision: 0.965, recall: 0.967, macrof1: 0.966, weightedf1: 0.966\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:34:15\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_18:34:15 -- Epoch: 5/20; Valid; loss: 0.391; acc: 0.848; precision: 0.850, recall: 0.845, macrof1: 0.848, weightedf1: 0.848\u001b[0m\n",
      "\u001b[92m2020-09-10 18:34:15\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=63.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:34:21\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_18:34:21 -- Epoch: 6/20; Train; loss: 0.071; acc: 0.987; precision: 0.987, recall: 0.987, macrof1: 0.987, weightedf1: 0.987\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:34:33\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_18:34:33 -- Epoch: 6/20; Valid; loss: 0.410; acc: 0.853; precision: 0.837, recall: 0.878, macrof1: 0.853, weightedf1: 0.853\u001b[0m\n",
      "\u001b[92m2020-09-10 18:34:33\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model (early stopped) with least valid loss (checkpoint: 5) at ./models/wikigaz_en_ft_ocr_gru_v002_n4000/wikigaz_en_ft_ocr_gru_v002_n4000.model\u001b[0m\n",
      "\u001b[92m2020-09-10 18:34:33\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n",
      "\u001b[92m2020-09-10 18:34:33\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mEarly stopping at epoch: 6, selected epoch: 5\u001b[0m\n",
      "\n",
      "\n",
      "\n",
      "\n",
      "====================\n",
      "User time: 92.7731\n",
      "====================\n"
     ]
    }
   ],
   "source": [
    "from DeezyMatch import finetune as dm_finetune\n",
    "\n",
    "# number of OCR training instances used for fine-tuning\n",
    "n_ft_examples = 4000\n",
    "\n",
    "# Fine-tune the pretrained WG:en GRU model (pretrained_model_path, pretrained_vocab_path)\n",
    "# on the OCR dataset using the model B setup, i.e. only the embedding layer is frozen.\n",
    "dm_finetune(input_file_path=\"./inputs/input_dfm_gru_model_B.yaml\",\n",
    "            dataset_path=\"./dataset/ocr_trainval.txt\",\n",
    "            model_name=f\"wikigaz_en_ft_ocr_gru_v002_n{n_ft_examples}\",\n",
    "            pretrained_model_path=\"./models/wikigaz_en_gru_001/wikigaz_en_gru_001.model\",\n",
    "            pretrained_vocab_path=\"./models/wikigaz_en_gru_001/wikigaz_en_gru_001.vocab\",\n",
    "            n_train_examples=n_ft_examples\n",
    "           )"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 50,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:34:33\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mread input file: ./inputs/input_dfm_gru_model_B.yaml\u001b[0m\n",
      "\u001b[92m2020-09-10 18:34:33\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32mpytorch will use: cuda:1\u001b[0m\n",
      "\u001b[92m2020-09-10 18:34:33\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mread CSV file: ./dataset/ocr_trainval.txt\u001b[0m\n",
      "\u001b[92m2020-09-10 18:34:34\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32mnumber of labels, True: 42301 and False: 42302\u001b[0m\n",
      "\u001b[92m2020-09-10 18:34:34\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mSplitting the Dataset\u001b[0m\n",
      "\u001b[92m2020-09-10 18:34:34\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mfinish splitting the Dataset. User time: 0.0547635555267334\u001b[0m\n",
      "\u001b[92m2020-09-10 18:34:34\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msplits are as follow:\n",
      "not_assigned    51221\n",
      "val             25380\n",
      "train            8000\n",
      "test                2\n",
      "Name: split, dtype: int64\u001b[0m\n",
      "\u001b[92m2020-09-10 18:34:34\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mstart creating a lookup table and convert characters to indices\u001b[0m\n",
      "\u001b[92m2020-09-10 18:34:34\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32m-- create vocabulary\u001b[0m\n",
      "\u001b[92m2020-09-10 18:34:35\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32m-- convert tokens to indices\u001b[0m\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "s2 padding:   0%|          | 0/8000 [00:00<?, ?it/s]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:34:36\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mskipping 0 lines\u001b[0m\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "                                                                     \r"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "\n",
      "============================================================\n",
      "List all parameters in the model\n",
      "============================================================\n",
      "emb.weight False\n",
      "rnn_1.weight_ih_l0 True\n",
      "rnn_1.weight_hh_l0 True\n",
      "rnn_1.bias_ih_l0 True\n",
      "rnn_1.bias_hh_l0 True\n",
      "rnn_1.weight_ih_l0_reverse True\n",
      "rnn_1.weight_hh_l0_reverse True\n",
      "rnn_1.bias_ih_l0_reverse True\n",
      "rnn_1.bias_hh_l0_reverse True\n",
      "rnn_1.weight_ih_l1 True\n",
      "rnn_1.weight_hh_l1 True\n",
      "rnn_1.bias_ih_l1 True\n",
      "rnn_1.bias_hh_l1 True\n",
      "rnn_1.weight_ih_l1_reverse True\n",
      "rnn_1.weight_hh_l1_reverse True\n",
      "rnn_1.bias_ih_l1_reverse True\n",
      "rnn_1.bias_hh_l1_reverse True\n",
      "attn_step1.weight True\n",
      "attn_step1.bias True\n",
      "attn_step2.weight True\n",
      "attn_step2.bias True\n",
      "fc1.weight True\n",
      "fc1.bias True\n",
      "fc2.weight True\n",
      "fc2.bias True\n",
      "============================================================\n",
      "\n",
      "\n",
      "\n",
      "\u001b[92m2020-09-10 18:34:37\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m******************************\u001b[0m\n",
      "\u001b[92m2020-09-10 18:34:37\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m**** (Bi-directional) GRU ****\u001b[0m\n",
      "\u001b[92m2020-09-10 18:34:37\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m******************************\u001b[0m\n",
      "\u001b[92m2020-09-10 18:34:37\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mNumber of batches: 125\u001b[0m\n",
      "\u001b[92m2020-09-10 18:34:37\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mNumber of epochs: 20\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "3e0112f29ebf4a06a514ed811ae65443",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=20.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=125.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "\n",
      "\n",
      "====================\n",
      "Total number of params: 684843\n",
      "\n",
      "two_parallel_rnns (\n",
      "  (emb): Embedding(7542, 60), weights=((7542, 60),), parameters=452520\n",
      "  (rnn_1): GRU(60, 60, num_layers=2, dropout=0.01, bidirectional=True), weights=((180, 60), (180, 60), (180,), (180,), (180, 60), (180, 60), (180,), (180,), (180, 120), (180, 60), (180,), (180,), (180, 120), (180, 60), (180,), (180,)), parameters=109440\n",
      "  (attn_step1): Linear(in_features=120, out_features=60, bias=True), weights=((60, 120), (60,)), parameters=7260\n",
      "  (attn_step2): Linear(in_features=60, out_features=1, bias=True), weights=((1, 60), (1,)), parameters=61\n",
      "  (fc1): Linear(in_features=960, out_features=120, bias=True), weights=((120, 960), (120,)), parameters=115320\n",
      "  (fc2): Linear(in_features=120, out_features=2, bias=True), weights=((2, 120), (2,)), parameters=242\n",
      ")\n",
      "====================\n",
      "\n",
      "\n",
      "\u001b[92m2020-09-10 18:34:48\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_18:34:48 -- Epoch: 1/20; Train; loss: 0.664; acc: 0.671; precision: 0.660, recall: 0.705, macrof1: 0.671, weightedf1: 0.671\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:35:00\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_18:35:00 -- Epoch: 1/20; Valid; loss: 0.459; acc: 0.789; precision: 0.807, recall: 0.758, macrof1: 0.788, weightedf1: 0.788\u001b[0m\n",
      "\u001b[92m2020-09-10 18:35:00\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=125.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:35:11\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_18:35:11 -- Epoch: 2/20; Train; loss: 0.351; acc: 0.849; precision: 0.849, recall: 0.849, macrof1: 0.849, weightedf1: 0.849\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:35:23\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_18:35:23 -- Epoch: 2/20; Valid; loss: 0.339; acc: 0.861; precision: 0.859, recall: 0.862, macrof1: 0.861, weightedf1: 0.861\u001b[0m\n",
      "\u001b[92m2020-09-10 18:35:23\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=125.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:35:35\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_18:35:35 -- Epoch: 3/20; Train; loss: 0.215; acc: 0.920; precision: 0.918, recall: 0.922, macrof1: 0.920, weightedf1: 0.920\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:35:47\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_18:35:47 -- Epoch: 3/20; Valid; loss: 0.306; acc: 0.878; precision: 0.885, recall: 0.868, macrof1: 0.878, weightedf1: 0.878\u001b[0m\n",
      "\u001b[92m2020-09-10 18:35:47\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=125.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:35:58\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_18:35:58 -- Epoch: 4/20; Train; loss: 0.129; acc: 0.960; precision: 0.958, recall: 0.961, macrof1: 0.959, weightedf1: 0.959\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:36:11\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_18:36:11 -- Epoch: 4/20; Valid; loss: 0.299; acc: 0.886; precision: 0.882, recall: 0.891, macrof1: 0.886, weightedf1: 0.886\u001b[0m\n",
      "\u001b[92m2020-09-10 18:36:11\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=125.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:36:23\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_18:36:23 -- Epoch: 5/20; Train; loss: 0.070; acc: 0.982; precision: 0.980, recall: 0.985, macrof1: 0.982, weightedf1: 0.982\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:36:34\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_18:36:34 -- Epoch: 5/20; Valid; loss: 0.319; acc: 0.890; precision: 0.882, recall: 0.902, macrof1: 0.890, weightedf1: 0.890\u001b[0m\n",
      "\u001b[92m2020-09-10 18:36:34\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model (early stopped) with least valid loss (checkpoint: 4) at ./models/wikigaz_en_ft_ocr_gru_v002_n8000/wikigaz_en_ft_ocr_gru_v002_n8000.model\u001b[0m\n",
      "\u001b[92m2020-09-10 18:36:34\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n",
      "\u001b[92m2020-09-10 18:36:34\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mEarly stopping at epoch: 5, selected epoch: 4\u001b[0m\n",
      "\n",
      "\n",
      "\n",
      "\n",
      "====================\n",
      "User time: 117.6755\n",
      "====================\n"
     ]
    }
   ],
   "source": [
    "from DeezyMatch import finetune as dm_finetune\n",
    "\n",
    "# number of OCR training instances used for fine-tuning\n",
    "n_ft_examples = 8000\n",
    "\n",
    "# Fine-tune the pretrained WG:en GRU model (pretrained_model_path, pretrained_vocab_path)\n",
    "# on the OCR dataset using the model B setup, i.e. only the embedding layer is frozen.\n",
    "dm_finetune(input_file_path=\"./inputs/input_dfm_gru_model_B.yaml\",\n",
    "            dataset_path=\"./dataset/ocr_trainval.txt\",\n",
    "            model_name=f\"wikigaz_en_ft_ocr_gru_v002_n{n_ft_examples}\",\n",
    "            pretrained_model_path=\"./models/wikigaz_en_gru_001/wikigaz_en_gru_001.model\",\n",
    "            pretrained_vocab_path=\"./models/wikigaz_en_gru_001/wikigaz_en_gru_001.vocab\",\n",
    "            n_train_examples=n_ft_examples\n",
    "           )"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 51,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:36:34\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mread input file: ./inputs/input_dfm_gru_model_B.yaml\u001b[0m\n",
      "\u001b[92m2020-09-10 18:36:34\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32mpytorch will use: cuda:1\u001b[0m\n",
      "\u001b[92m2020-09-10 18:36:34\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mread CSV file: ./dataset/ocr_trainval.txt\u001b[0m\n",
      "\u001b[92m2020-09-10 18:36:35\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32mnumber of labels, True: 42301 and False: 42302\u001b[0m\n",
      "\u001b[92m2020-09-10 18:36:35\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mSplitting the Dataset\u001b[0m\n",
      "\u001b[92m2020-09-10 18:36:35\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mfinish splitting the Dataset. User time: 0.05351972579956055\u001b[0m\n",
      "\u001b[92m2020-09-10 18:36:35\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msplits are as follow:\n",
      "not_assigned    43221\n",
      "val             25380\n",
      "train           16000\n",
      "test                2\n",
      "Name: split, dtype: int64\u001b[0m\n",
      "\u001b[92m2020-09-10 18:36:35\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mstart creating a lookup table and convert characters to indices\u001b[0m\n",
      "\u001b[92m2020-09-10 18:36:35\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32m-- create vocabulary\u001b[0m\n",
      "\u001b[92m2020-09-10 18:36:36\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32m-- convert tokens to indices\u001b[0m\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "s1 padding:   0%|          | 0/16000 [00:00<?, ?it/s]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:36:37\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mskipping 0 lines\u001b[0m\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "                                                                     \r"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "\n",
      "============================================================\n",
      "List all parameters in the model\n",
      "============================================================\n",
      "emb.weight False\n",
      "rnn_1.weight_ih_l0 True\n",
      "rnn_1.weight_hh_l0 True\n",
      "rnn_1.bias_ih_l0 True\n",
      "rnn_1.bias_hh_l0 True\n",
      "rnn_1.weight_ih_l0_reverse True\n",
      "rnn_1.weight_hh_l0_reverse True\n",
      "rnn_1.bias_ih_l0_reverse True\n",
      "rnn_1.bias_hh_l0_reverse True\n",
      "rnn_1.weight_ih_l1 True\n",
      "rnn_1.weight_hh_l1 True\n",
      "rnn_1.bias_ih_l1 True\n",
      "rnn_1.bias_hh_l1 True\n",
      "rnn_1.weight_ih_l1_reverse True\n",
      "rnn_1.weight_hh_l1_reverse True\n",
      "rnn_1.bias_ih_l1_reverse True\n",
      "rnn_1.bias_hh_l1_reverse True\n",
      "attn_step1.weight True\n",
      "attn_step1.bias True\n",
      "attn_step2.weight True\n",
      "attn_step2.bias True\n",
      "fc1.weight True\n",
      "fc1.bias True\n",
      "fc2.weight True\n",
      "fc2.bias True\n",
      "============================================================\n",
      "\n",
      "\n",
      "\n",
      "\u001b[92m2020-09-10 18:36:38\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m******************************\u001b[0m\n",
      "\u001b[92m2020-09-10 18:36:38\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m**** (Bi-directional) GRU ****\u001b[0m\n",
      "\u001b[92m2020-09-10 18:36:38\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m******************************\u001b[0m\n",
      "\u001b[92m2020-09-10 18:36:38\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mNumber of batches: 250\u001b[0m\n",
      "\u001b[92m2020-09-10 18:36:38\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mNumber of epochs: 20\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "9d92346481a74e519ef0323a566d245a",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=20.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=250.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "\n",
      "\n",
      "====================\n",
      "Total number of params: 684843\n",
      "\n",
      "two_parallel_rnns (\n",
      "  (emb): Embedding(7542, 60), weights=((7542, 60),), parameters=452520\n",
      "  (rnn_1): GRU(60, 60, num_layers=2, dropout=0.01, bidirectional=True), weights=((180, 60), (180, 60), (180,), (180,), (180, 60), (180, 60), (180,), (180,), (180, 120), (180, 60), (180,), (180,), (180, 120), (180, 60), (180,), (180,)), parameters=109440\n",
      "  (attn_step1): Linear(in_features=120, out_features=60, bias=True), weights=((60, 120), (60,)), parameters=7260\n",
      "  (attn_step2): Linear(in_features=60, out_features=1, bias=True), weights=((1, 60), (1,)), parameters=61\n",
      "  (fc1): Linear(in_features=960, out_features=120, bias=True), weights=((120, 960), (120,)), parameters=115320\n",
      "  (fc2): Linear(in_features=120, out_features=2, bias=True), weights=((2, 120), (2,)), parameters=242\n",
      ")\n",
      "====================\n",
      "\n",
      "\n",
      "\u001b[92m2020-09-10 18:37:00\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_18:37:00 -- Epoch: 1/20; Train; loss: 0.526; acc: 0.755; precision: 0.747, recall: 0.774, macrof1: 0.755, weightedf1: 0.755\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:37:12\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_18:37:12 -- Epoch: 1/20; Valid; loss: 0.340; acc: 0.862; precision: 0.890, recall: 0.825, macrof1: 0.861, weightedf1: 0.861\u001b[0m\n",
      "\u001b[92m2020-09-10 18:37:12\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=250.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:37:35\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_18:37:35 -- Epoch: 2/20; Train; loss: 0.244; acc: 0.904; precision: 0.898, recall: 0.910, macrof1: 0.904, weightedf1: 0.904\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:37:48\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_18:37:48 -- Epoch: 2/20; Valid; loss: 0.256; acc: 0.899; precision: 0.897, recall: 0.902, macrof1: 0.899, weightedf1: 0.899\u001b[0m\n",
      "\u001b[92m2020-09-10 18:37:48\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=250.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:38:10\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_18:38:10 -- Epoch: 3/20; Train; loss: 0.145; acc: 0.949; precision: 0.945, recall: 0.953, macrof1: 0.949, weightedf1: 0.949\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:38:23\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_18:38:23 -- Epoch: 3/20; Valid; loss: 0.235; acc: 0.911; precision: 0.903, recall: 0.921, macrof1: 0.911, weightedf1: 0.911\u001b[0m\n",
      "\u001b[92m2020-09-10 18:38:23\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=250.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:38:45\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_18:38:45 -- Epoch: 4/20; Train; loss: 0.084; acc: 0.975; precision: 0.971, recall: 0.978, macrof1: 0.975, weightedf1: 0.975\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:38:57\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_18:38:57 -- Epoch: 4/20; Valid; loss: 0.242; acc: 0.914; precision: 0.916, recall: 0.912, macrof1: 0.914, weightedf1: 0.914\u001b[0m\n",
      "\u001b[92m2020-09-10 18:38:57\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model (early stopped) with least valid loss (checkpoint: 3) at ./models/wikigaz_en_ft_ocr_gru_v002_n16000/wikigaz_en_ft_ocr_gru_v002_n16000.model\u001b[0m\n",
      "\u001b[92m2020-09-10 18:38:57\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n",
      "\u001b[92m2020-09-10 18:38:57\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mEarly stopping at epoch: 4, selected epoch: 3\u001b[0m\n",
      "\n",
      "\n",
      "\n",
      "\n",
      "====================\n",
      "User time: 139.5836\n",
      "====================\n"
     ]
    }
   ],
   "source": [
    "from DeezyMatch import finetune as dm_finetune\n",
    "\n",
    "n_ft_examples = 16000\n",
    "\n",
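    "# Fine-tune the pretrained WG:en GRU model (given by pretrained_model_path and pretrained_vocab_path)\n",
    "# on n_ft_examples training examples from the OCR dataset, using the model B config (embedding layer frozen)\n",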
    "dm_finetune(input_file_path=\"./inputs/input_dfm_gru_model_B.yaml\", \n",
    "            dataset_path=\"./dataset/ocr_trainval.txt\", \n",
    "            model_name=f\"wikigaz_en_ft_ocr_gru_v002_n{n_ft_examples}\",\n",
    "            pretrained_model_path=\"./models/wikigaz_en_gru_001/wikigaz_en_gru_001.model\", \n",
    "            pretrained_vocab_path=\"./models/wikigaz_en_gru_001/wikigaz_en_gru_001.vocab\",\n",
    "            n_train_examples=n_ft_examples\n",
    "           )"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 52,
   "metadata": {
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:38:57\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mread input file: ./inputs/input_dfm_gru_model_B.yaml\u001b[0m\n",
      "\u001b[92m2020-09-10 18:38:57\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32mpytorch will use: cuda:1\u001b[0m\n",
      "\u001b[92m2020-09-10 18:38:57\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mread CSV file: ./dataset/ocr_trainval.txt\u001b[0m\n",
      "\u001b[92m2020-09-10 18:38:58\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32mnumber of labels, True: 42301 and False: 42302\u001b[0m\n",
      "\u001b[92m2020-09-10 18:38:58\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mSplitting the Dataset\u001b[0m\n",
      "\u001b[92m2020-09-10 18:38:58\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mfinish splitting the Dataset. User time: 0.06592154502868652\u001b[0m\n",
      "\u001b[92m2020-09-10 18:38:58\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msplits are as follow:\n",
      "train           32000\n",
      "not_assigned    27221\n",
      "val             25380\n",
      "test                2\n",
      "Name: split, dtype: int64\u001b[0m\n",
      "\u001b[92m2020-09-10 18:38:58\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mstart creating a lookup table and convert characters to indices\u001b[0m\n",
      "\u001b[92m2020-09-10 18:38:58\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32m-- create vocabulary\u001b[0m\n",
      "\u001b[92m2020-09-10 18:38:59\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32m-- convert tokens to indices\u001b[0m\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "s1 padding:   0%|          | 0/32000 [00:00<?, ?it/s]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:39:00\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mskipping 0 lines\u001b[0m\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "                                                                     \r"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "\n",
      "============================================================\n",
      "List all parameters in the model\n",
      "============================================================\n",
      "emb.weight False\n",
      "rnn_1.weight_ih_l0 True\n",
      "rnn_1.weight_hh_l0 True\n",
      "rnn_1.bias_ih_l0 True\n",
      "rnn_1.bias_hh_l0 True\n",
      "rnn_1.weight_ih_l0_reverse True\n",
      "rnn_1.weight_hh_l0_reverse True\n",
      "rnn_1.bias_ih_l0_reverse True\n",
      "rnn_1.bias_hh_l0_reverse True\n",
      "rnn_1.weight_ih_l1 True\n",
      "rnn_1.weight_hh_l1 True\n",
      "rnn_1.bias_ih_l1 True\n",
      "rnn_1.bias_hh_l1 True\n",
      "rnn_1.weight_ih_l1_reverse True\n",
      "rnn_1.weight_hh_l1_reverse True\n",
      "rnn_1.bias_ih_l1_reverse True\n",
      "rnn_1.bias_hh_l1_reverse True\n",
      "attn_step1.weight True\n",
      "attn_step1.bias True\n",
      "attn_step2.weight True\n",
      "attn_step2.bias True\n",
      "fc1.weight True\n",
      "fc1.bias True\n",
      "fc2.weight True\n",
      "fc2.bias True\n",
      "============================================================\n",
      "\n",
      "\n",
      "\n",
      "\u001b[92m2020-09-10 18:39:01\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m******************************\u001b[0m\n",
      "\u001b[92m2020-09-10 18:39:01\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m**** (Bi-directional) GRU ****\u001b[0m\n",
      "\u001b[92m2020-09-10 18:39:01\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m******************************\u001b[0m\n",
      "\u001b[92m2020-09-10 18:39:01\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mNumber of batches: 500\u001b[0m\n",
      "\u001b[92m2020-09-10 18:39:01\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mNumber of epochs: 20\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "e1900856d36846e2ab1ea952cd84be8c",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=20.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=500.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "\n",
      "\n",
      "====================\n",
      "Total number of params: 684843\n",
      "\n",
      "two_parallel_rnns (\n",
      "  (emb): Embedding(7542, 60), weights=((7542, 60),), parameters=452520\n",
      "  (rnn_1): GRU(60, 60, num_layers=2, dropout=0.01, bidirectional=True), weights=((180, 60), (180, 60), (180,), (180,), (180, 60), (180, 60), (180,), (180,), (180, 120), (180, 60), (180,), (180,), (180, 120), (180, 60), (180,), (180,)), parameters=109440\n",
      "  (attn_step1): Linear(in_features=120, out_features=60, bias=True), weights=((60, 120), (60,)), parameters=7260\n",
      "  (attn_step2): Linear(in_features=60, out_features=1, bias=True), weights=((1, 60), (1,)), parameters=61\n",
      "  (fc1): Linear(in_features=960, out_features=120, bias=True), weights=((120, 960), (120,)), parameters=115320\n",
      "  (fc2): Linear(in_features=120, out_features=2, bias=True), weights=((2, 120), (2,)), parameters=242\n",
      ")\n",
      "====================\n",
      "\n",
      "\n",
      "\u001b[92m2020-09-10 18:39:45\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_18:39:45 -- Epoch: 1/20; Train; loss: 0.405; acc: 0.819; precision: 0.812, recall: 0.830, macrof1: 0.819, weightedf1: 0.819\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:39:58\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_18:39:58 -- Epoch: 1/20; Valid; loss: 0.259; acc: 0.895; precision: 0.864, recall: 0.937, macrof1: 0.895, weightedf1: 0.895\u001b[0m\n",
      "\u001b[92m2020-09-10 18:39:58\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=500.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:40:43\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_18:40:43 -- Epoch: 2/20; Train; loss: 0.179; acc: 0.933; precision: 0.927, recall: 0.940, macrof1: 0.933, weightedf1: 0.933\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:40:56\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_18:40:56 -- Epoch: 2/20; Valid; loss: 0.203; acc: 0.922; precision: 0.896, recall: 0.956, macrof1: 0.922, weightedf1: 0.922\u001b[0m\n",
      "\u001b[92m2020-09-10 18:40:56\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=500.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:41:28\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_18:41:28 -- Epoch: 3/20; Train; loss: 0.105; acc: 0.964; precision: 0.958, recall: 0.970, macrof1: 0.964, weightedf1: 0.964\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:41:36\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_18:41:36 -- Epoch: 3/20; Valid; loss: 0.190; acc: 0.932; precision: 0.931, recall: 0.933, macrof1: 0.932, weightedf1: 0.932\u001b[0m\n",
      "\u001b[92m2020-09-10 18:41:36\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=500.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:42:05\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_18:42:05 -- Epoch: 4/20; Train; loss: 0.063; acc: 0.980; precision: 0.977, recall: 0.984, macrof1: 0.980, weightedf1: 0.980\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:42:13\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_18:42:13 -- Epoch: 4/20; Valid; loss: 0.192; acc: 0.937; precision: 0.934, recall: 0.940, macrof1: 0.937, weightedf1: 0.937\u001b[0m\n",
      "\u001b[92m2020-09-10 18:42:13\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model (early stopped) with least valid loss (checkpoint: 3) at ./models/wikigaz_en_ft_ocr_gru_v002_n32000/wikigaz_en_ft_ocr_gru_v002_n32000.model\u001b[0m\n",
      "\u001b[92m2020-09-10 18:42:13\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n",
      "\u001b[92m2020-09-10 18:42:13\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mEarly stopping at epoch: 4, selected epoch: 3\u001b[0m\n",
      "\n",
      "\n",
      "\n",
      "\n",
      "====================\n",
      "User time: 192.2439\n",
      "====================\n"
     ]
    }
   ],
   "source": [
    "from DeezyMatch import finetune as dm_finetune\n",
    "\n",
    "n_ft_examples = 32000\n",
    "\n",
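    "# Fine-tune the pretrained WG:en GRU model (given by pretrained_model_path and pretrained_vocab_path)\n",
    "# on n_ft_examples training examples from the OCR dataset, using the model B config (embedding layer frozen)\n",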
    "dm_finetune(input_file_path=\"./inputs/input_dfm_gru_model_B.yaml\", \n",
    "            dataset_path=\"./dataset/ocr_trainval.txt\", \n",
    "            model_name=f\"wikigaz_en_ft_ocr_gru_v002_n{n_ft_examples}\",\n",
    "            pretrained_model_path=\"./models/wikigaz_en_gru_001/wikigaz_en_gru_001.model\", \n",
    "            pretrained_vocab_path=\"./models/wikigaz_en_gru_001/wikigaz_en_gru_001.vocab\",\n",
    "            n_train_examples=n_ft_examples\n",
    "           )"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 53,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:42:13\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mread input file: ./inputs/input_dfm_gru_model_B_no_early_stopping.yaml\u001b[0m\n",
      "\u001b[92m2020-09-10 18:42:13\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32mpytorch will use: cuda:1\u001b[0m\n",
      "\u001b[92m2020-09-10 18:42:13\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mread CSV file: ./dataset/ocr_trainval.txt\u001b[0m\n",
      "\u001b[92m2020-09-10 18:42:14\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32mnumber of labels, True: 42301 and False: 42302\u001b[0m\n",
      "\u001b[92m2020-09-10 18:42:14\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mSplitting the Dataset\u001b[0m\n",
      "\u001b[92m2020-09-10 18:42:14\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mfinish splitting the Dataset. User time: 0.0485692024230957\u001b[0m\n",
      "\u001b[92m2020-09-10 18:42:14\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msplits are as follow:\n",
      "train    64000\n",
      "val      20603\n",
      "Name: split, dtype: int64\u001b[0m\n",
      "\u001b[92m2020-09-10 18:42:14\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mstart creating a lookup table and convert characters to indices\u001b[0m\n",
      "\u001b[92m2020-09-10 18:42:14\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32m-- create vocabulary\u001b[0m\n",
      "\u001b[92m2020-09-10 18:42:15\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32m-- convert tokens to indices\u001b[0m\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "length s2:   0%|          | 0/64000 [00:00<?, ?it/s]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:42:16\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mskipping 0 lines\u001b[0m\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "                                                                     \r"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "\n",
      "============================================================\n",
      "List all parameters in the model\n",
      "============================================================\n",
      "emb.weight False\n",
      "rnn_1.weight_ih_l0 True\n",
      "rnn_1.weight_hh_l0 True\n",
      "rnn_1.bias_ih_l0 True\n",
      "rnn_1.bias_hh_l0 True\n",
      "rnn_1.weight_ih_l0_reverse True\n",
      "rnn_1.weight_hh_l0_reverse True\n",
      "rnn_1.bias_ih_l0_reverse True\n",
      "rnn_1.bias_hh_l0_reverse True\n",
      "rnn_1.weight_ih_l1 True\n",
      "rnn_1.weight_hh_l1 True\n",
      "rnn_1.bias_ih_l1 True\n",
      "rnn_1.bias_hh_l1 True\n",
      "rnn_1.weight_ih_l1_reverse True\n",
      "rnn_1.weight_hh_l1_reverse True\n",
      "rnn_1.bias_ih_l1_reverse True\n",
      "rnn_1.bias_hh_l1_reverse True\n",
      "attn_step1.weight True\n",
      "attn_step1.bias True\n",
      "attn_step2.weight True\n",
      "attn_step2.bias True\n",
      "fc1.weight True\n",
      "fc1.bias True\n",
      "fc2.weight True\n",
      "fc2.bias True\n",
      "============================================================\n",
      "\n",
      "\n",
      "\n",
      "\u001b[92m2020-09-10 18:42:17\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m******************************\u001b[0m\n",
      "\u001b[92m2020-09-10 18:42:17\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m**** (Bi-directional) GRU ****\u001b[0m\n",
      "\u001b[92m2020-09-10 18:42:17\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m******************************\u001b[0m\n",
      "\u001b[92m2020-09-10 18:42:17\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mNumber of batches: 1000\u001b[0m\n",
      "\u001b[92m2020-09-10 18:42:17\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mNumber of epochs: 10\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "1c7eabaf4c8a4ae5b552e20824035b27",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=10.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=1000.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "\n",
      "\n",
      "====================\n",
      "Total number of params: 684843\n",
      "\n",
      "two_parallel_rnns (\n",
      "  (emb): Embedding(7542, 60), weights=((7542, 60),), parameters=452520\n",
      "  (rnn_1): GRU(60, 60, num_layers=2, dropout=0.01, bidirectional=True), weights=((180, 60), (180, 60), (180,), (180,), (180, 60), (180, 60), (180,), (180,), (180, 120), (180, 60), (180,), (180,), (180, 120), (180, 60), (180,), (180,)), parameters=109440\n",
      "  (attn_step1): Linear(in_features=120, out_features=60, bias=True), weights=((60, 120), (60,)), parameters=7260\n",
      "  (attn_step2): Linear(in_features=60, out_features=1, bias=True), weights=((1, 60), (1,)), parameters=61\n",
      "  (fc1): Linear(in_features=960, out_features=120, bias=True), weights=((120, 960), (120,)), parameters=115320\n",
      "  (fc2): Linear(in_features=120, out_features=2, bias=True), weights=((2, 120), (2,)), parameters=242\n",
      ")\n",
      "====================\n",
      "\n",
      "\n",
      "\u001b[92m2020-09-10 18:43:14\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_18:43:14 -- Epoch: 1/10; Train; loss: 0.306; acc: 0.870; precision: 0.862, recall: 0.882, macrof1: 0.870, weightedf1: 0.870\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=322.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:43:20\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_18:43:20 -- Epoch: 1/10; Valid; loss: 0.183; acc: 0.930; precision: 0.932, recall: 0.927, macrof1: 0.930, weightedf1: 0.930\u001b[0m\n",
      "\u001b[92m2020-09-10 18:43:20\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=1000.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:44:17\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_18:44:17 -- Epoch: 2/10; Train; loss: 0.134; acc: 0.951; precision: 0.944, recall: 0.959, macrof1: 0.951, weightedf1: 0.951\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=322.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:44:24\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_18:44:24 -- Epoch: 2/10; Valid; loss: 0.142; acc: 0.949; precision: 0.938, recall: 0.961, macrof1: 0.949, weightedf1: 0.949\u001b[0m\n",
      "\u001b[92m2020-09-10 18:44:24\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=1000.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:45:20\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_18:45:20 -- Epoch: 3/10; Train; loss: 0.086; acc: 0.971; precision: 0.966, recall: 0.976, macrof1: 0.971, weightedf1: 0.971\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=322.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:45:28\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_18:45:28 -- Epoch: 3/10; Valid; loss: 0.143; acc: 0.949; precision: 0.961, recall: 0.935, macrof1: 0.949, weightedf1: 0.949\u001b[0m\n",
      "\u001b[92m2020-09-10 18:45:28\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=1000.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:46:58\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_18:46:58 -- Epoch: 4/10; Train; loss: 0.059; acc: 0.981; precision: 0.978, recall: 0.984, macrof1: 0.981, weightedf1: 0.981\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=322.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:47:09\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_18:47:09 -- Epoch: 4/10; Valid; loss: 0.141; acc: 0.952; precision: 0.952, recall: 0.952, macrof1: 0.952, weightedf1: 0.952\u001b[0m\n",
      "\u001b[92m2020-09-10 18:47:09\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=1000.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:48:44\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_18:48:44 -- Epoch: 5/10; Train; loss: 0.042; acc: 0.987; precision: 0.985, recall: 0.988, macrof1: 0.987, weightedf1: 0.987\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=322.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:48:55\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_18:48:55 -- Epoch: 5/10; Valid; loss: 0.162; acc: 0.951; precision: 0.935, recall: 0.969, macrof1: 0.951, weightedf1: 0.951\u001b[0m\n",
      "\u001b[92m2020-09-10 18:48:55\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=1000.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:50:30\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_18:50:30 -- Epoch: 6/10; Train; loss: 0.032; acc: 0.990; precision: 0.988, recall: 0.992, macrof1: 0.990, weightedf1: 0.990\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=322.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:50:42\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_18:50:42 -- Epoch: 6/10; Valid; loss: 0.162; acc: 0.953; precision: 0.955, recall: 0.950, macrof1: 0.953, weightedf1: 0.953\u001b[0m\n",
      "\u001b[92m2020-09-10 18:50:42\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=1000.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:52:19\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_18:52:19 -- Epoch: 7/10; Train; loss: 0.028; acc: 0.991; precision: 0.990, recall: 0.992, macrof1: 0.991, weightedf1: 0.991\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=322.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:52:28\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_18:52:28 -- Epoch: 7/10; Valid; loss: 0.175; acc: 0.953; precision: 0.945, recall: 0.961, macrof1: 0.953, weightedf1: 0.953\u001b[0m\n",
      "\u001b[92m2020-09-10 18:52:28\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=1000.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:54:05\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_18:54:05 -- Epoch: 8/10; Train; loss: 0.027; acc: 0.992; precision: 0.991, recall: 0.993, macrof1: 0.992, weightedf1: 0.992\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=322.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:54:17\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_18:54:17 -- Epoch: 8/10; Valid; loss: 0.174; acc: 0.953; precision: 0.956, recall: 0.950, macrof1: 0.953, weightedf1: 0.953\u001b[0m\n",
      "\u001b[92m2020-09-10 18:54:17\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=1000.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:55:51\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_18:55:51 -- Epoch: 9/10; Train; loss: 0.021; acc: 0.993; precision: 0.992, recall: 0.994, macrof1: 0.993, weightedf1: 0.993\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=322.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:56:03\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_18:56:03 -- Epoch: 9/10; Valid; loss: 0.182; acc: 0.954; precision: 0.942, recall: 0.967, macrof1: 0.954, weightedf1: 0.954\u001b[0m\n",
      "\u001b[92m2020-09-10 18:56:03\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=1000.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:57:26\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_18:57:26 -- Epoch: 10/10; Train; loss: 0.022; acc: 0.993; precision: 0.992, recall: 0.994, macrof1: 0.993, weightedf1: 0.993\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=322.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:57:36\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_18:57:36 -- Epoch: 10/10; Valid; loss: 0.199; acc: 0.952; precision: 0.938, recall: 0.969, macrof1: 0.952, weightedf1: 0.952\u001b[0m\n",
      "\u001b[92m2020-09-10 18:57:36\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n",
      "\n",
      "\u001b[92m2020-09-10 18:57:36\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model with least valid loss (checkpoint: 4) at ./models/wikigaz_en_ft_ocr_gru_v002_n64000/wikigaz_en_ft_ocr_gru_v002_n64000.model\u001b[0m\n",
      "\n",
      "\n",
      "\n",
      "====================\n",
      "User time: 919.6673\n",
      "====================\n"
     ]
    }
   ],
   "source": [
    "from DeezyMatch import finetune as dm_finetune\n",
    "\n",
    "n_ft_examples = 64000\n",
    "\n",
    "# Fine-tune the pretrained model at pretrained_model_path (vocabulary at pretrained_vocab_path)\n",
    "# on n_ft_examples training examples from dataset_path\n",
    "dm_finetune(input_file_path=\"./inputs/input_dfm_gru_model_B_no_early_stopping.yaml\",\n",
    "            dataset_path=\"./dataset/ocr_trainval.txt\",\n",
    "            model_name=f\"wikigaz_en_ft_ocr_gru_v002_n{n_ft_examples}\",\n",
    "            pretrained_model_path=\"./models/wikigaz_en_gru_001/wikigaz_en_gru_001.model\",\n",
    "            pretrained_vocab_path=\"./models/wikigaz_en_gru_001/wikigaz_en_gru_001.vocab\",\n",
    "            n_train_examples=n_ft_examples\n",
    "           )"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 54,
   "metadata": {
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:57:36\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mread input file: ./inputs/input_dfm_gru_model_B_no_early_stopping.yaml\u001b[0m\n",
      "\u001b[92m2020-09-10 18:57:36\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32mpytorch will use: cuda:1\u001b[0m\n",
      "\u001b[92m2020-09-10 18:57:36\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mread CSV file: ./dataset/ocr_trainval.txt\u001b[0m\n",
      "\u001b[92m2020-09-10 18:57:37\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32mnumber of labels, True: 42301 and False: 42302\u001b[0m\n",
      "\u001b[92m2020-09-10 18:57:37\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mSplitting the Dataset\u001b[0m\n",
      "\u001b[92m2020-09-10 18:57:37\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mfinish splitting the Dataset. User time: 0.05806612968444824\u001b[0m\n",
      "\u001b[92m2020-09-10 18:57:37\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msplits are as follow:\n",
      "train    84000\n",
      "val        603\n",
      "Name: split, dtype: int64\u001b[0m\n",
      "\u001b[92m2020-09-10 18:57:37\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mstart creating a lookup table and convert characters to indices\u001b[0m\n",
      "\u001b[92m2020-09-10 18:57:37\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32m-- create vocabulary\u001b[0m\n",
      "\u001b[92m2020-09-10 18:57:38\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32m-- convert tokens to indices\u001b[0m\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      "length s1:   0%|          | 0/84000 [00:00<?, ?it/s]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:57:39\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mskipping 0 lines\u001b[0m\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "                                                                     \r"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "\n",
      "============================================================\n",
      "List all parameters in the model\n",
      "============================================================\n",
      "emb.weight False\n",
      "rnn_1.weight_ih_l0 True\n",
      "rnn_1.weight_hh_l0 True\n",
      "rnn_1.bias_ih_l0 True\n",
      "rnn_1.bias_hh_l0 True\n",
      "rnn_1.weight_ih_l0_reverse True\n",
      "rnn_1.weight_hh_l0_reverse True\n",
      "rnn_1.bias_ih_l0_reverse True\n",
      "rnn_1.bias_hh_l0_reverse True\n",
      "rnn_1.weight_ih_l1 True\n",
      "rnn_1.weight_hh_l1 True\n",
      "rnn_1.bias_ih_l1 True\n",
      "rnn_1.bias_hh_l1 True\n",
      "rnn_1.weight_ih_l1_reverse True\n",
      "rnn_1.weight_hh_l1_reverse True\n",
      "rnn_1.bias_ih_l1_reverse True\n",
      "rnn_1.bias_hh_l1_reverse True\n",
      "attn_step1.weight True\n",
      "attn_step1.bias True\n",
      "attn_step2.weight True\n",
      "attn_step2.bias True\n",
      "fc1.weight True\n",
      "fc1.bias True\n",
      "fc2.weight True\n",
      "fc2.bias True\n",
      "============================================================\n",
      "\n",
      "\n",
      "\n",
      "\u001b[92m2020-09-10 18:57:40\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m******************************\u001b[0m\n",
      "\u001b[92m2020-09-10 18:57:40\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m**** (Bi-directional) GRU ****\u001b[0m\n",
      "\u001b[92m2020-09-10 18:57:40\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m******************************\u001b[0m\n",
      "\u001b[92m2020-09-10 18:57:40\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mNumber of batches: 1313\u001b[0m\n",
      "\u001b[92m2020-09-10 18:57:40\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mNumber of epochs: 10\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "a1ebb689d33b4a559c8a4e154702ce8c",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=10.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=1313.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "\n",
      "\n",
      "====================\n",
      "Total number of params: 684843\n",
      "\n",
      "two_parallel_rnns (\n",
      "  (emb): Embedding(7542, 60), weights=((7542, 60),), parameters=452520\n",
      "  (rnn_1): GRU(60, 60, num_layers=2, dropout=0.01, bidirectional=True), weights=((180, 60), (180, 60), (180,), (180,), (180, 60), (180, 60), (180,), (180,), (180, 120), (180, 60), (180,), (180,), (180, 120), (180, 60), (180,), (180,)), parameters=109440\n",
      "  (attn_step1): Linear(in_features=120, out_features=60, bias=True), weights=((60, 120), (60,)), parameters=7260\n",
      "  (attn_step2): Linear(in_features=60, out_features=1, bias=True), weights=((1, 60), (1,)), parameters=61\n",
      "  (fc1): Linear(in_features=960, out_features=120, bias=True), weights=((120, 960), (120,)), parameters=115320\n",
      "  (fc2): Linear(in_features=120, out_features=2, bias=True), weights=((2, 120), (2,)), parameters=242\n",
      ")\n",
      "====================\n",
      "\n",
      "\n",
      "\u001b[92m2020-09-10 18:59:42\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_18:59:42 -- Epoch: 1/10; Train; loss: 0.272; acc: 0.887; precision: 0.880, recall: 0.896, macrof1: 0.887, weightedf1: 0.887\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=10.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 18:59:42\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_18:59:42 -- Epoch: 1/10; Valid; loss: 0.144; acc: 0.947; precision: 0.932, recall: 0.963, macrof1: 0.947, weightedf1: 0.947\u001b[0m\n",
      "\u001b[92m2020-09-10 18:59:42\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=1313.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:01:44\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_19:01:44 -- Epoch: 2/10; Train; loss: 0.121; acc: 0.957; precision: 0.950, recall: 0.965, macrof1: 0.957, weightedf1: 0.957\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=10.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:01:44\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_19:01:44 -- Epoch: 2/10; Valid; loss: 0.098; acc: 0.960; precision: 0.957, recall: 0.963, macrof1: 0.960, weightedf1: 0.960\u001b[0m\n",
      "\u001b[92m2020-09-10 19:01:44\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=1313.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:03:43\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_19:03:43 -- Epoch: 3/10; Train; loss: 0.081; acc: 0.972; precision: 0.968, recall: 0.977, macrof1: 0.972, weightedf1: 0.972\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=10.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:03:44\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_19:03:44 -- Epoch: 3/10; Valid; loss: 0.106; acc: 0.959; precision: 0.939, recall: 0.980, macrof1: 0.959, weightedf1: 0.959\u001b[0m\n",
      "\u001b[92m2020-09-10 19:03:44\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=1313.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:05:45\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_19:05:45 -- Epoch: 4/10; Train; loss: 0.058; acc: 0.981; precision: 0.978, recall: 0.985, macrof1: 0.981, weightedf1: 0.981\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=10.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:05:46\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_19:05:46 -- Epoch: 4/10; Valid; loss: 0.104; acc: 0.960; precision: 0.945, recall: 0.977, macrof1: 0.960, weightedf1: 0.960\u001b[0m\n",
      "\u001b[92m2020-09-10 19:05:46\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=1313.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:07:16\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_19:07:16 -- Epoch: 5/10; Train; loss: 0.042; acc: 0.987; precision: 0.984, recall: 0.989, macrof1: 0.987, weightedf1: 0.987\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=10.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:07:16\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_19:07:16 -- Epoch: 5/10; Valid; loss: 0.097; acc: 0.972; precision: 0.958, recall: 0.987, macrof1: 0.972, weightedf1: 0.972\u001b[0m\n",
      "\u001b[92m2020-09-10 19:07:16\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=1313.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:08:31\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_19:08:31 -- Epoch: 6/10; Train; loss: 0.036; acc: 0.989; precision: 0.987, recall: 0.990, macrof1: 0.989, weightedf1: 0.989\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=10.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:08:31\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_19:08:31 -- Epoch: 6/10; Valid; loss: 0.101; acc: 0.967; precision: 0.946, recall: 0.990, macrof1: 0.967, weightedf1: 0.967\u001b[0m\n",
      "\u001b[92m2020-09-10 19:08:31\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=1313.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:09:47\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_19:09:47 -- Epoch: 7/10; Train; loss: 0.031; acc: 0.990; precision: 0.989, recall: 0.991, macrof1: 0.990, weightedf1: 0.990\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=10.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:09:47\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_19:09:47 -- Epoch: 7/10; Valid; loss: 0.094; acc: 0.964; precision: 0.957, recall: 0.970, macrof1: 0.964, weightedf1: 0.964\u001b[0m\n",
      "\u001b[92m2020-09-10 19:09:47\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=1313.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:11:08\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_19:11:08 -- Epoch: 8/10; Train; loss: 0.026; acc: 0.992; precision: 0.991, recall: 0.993, macrof1: 0.992, weightedf1: 0.992\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=10.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:11:09\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_19:11:09 -- Epoch: 8/10; Valid; loss: 0.105; acc: 0.967; precision: 0.958, recall: 0.977, macrof1: 0.967, weightedf1: 0.967\u001b[0m\n",
      "\u001b[92m2020-09-10 19:11:09\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=1313.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:12:31\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_19:12:31 -- Epoch: 9/10; Train; loss: 0.024; acc: 0.993; precision: 0.991, recall: 0.994, macrof1: 0.992, weightedf1: 0.992\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=10.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:12:31\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_19:12:31 -- Epoch: 9/10; Valid; loss: 0.093; acc: 0.973; precision: 0.961, recall: 0.987, macrof1: 0.973, weightedf1: 0.973\u001b[0m\n",
      "\u001b[92m2020-09-10 19:12:31\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=1313.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:13:46\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_19:13:46 -- Epoch: 10/10; Train; loss: 0.024; acc: 0.993; precision: 0.991, recall: 0.994, macrof1: 0.993, weightedf1: 0.993\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=10.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:13:47\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_19:13:47 -- Epoch: 10/10; Valid; loss: 0.108; acc: 0.970; precision: 0.955, recall: 0.987, macrof1: 0.970, weightedf1: 0.970\u001b[0m\n",
      "\u001b[92m2020-09-10 19:13:47\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n",
      "\n",
      "\u001b[92m2020-09-10 19:13:47\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model with least valid loss (checkpoint: 9) at ./models/wikigaz_en_ft_ocr_gru_v002_n84000/wikigaz_en_ft_ocr_gru_v002_n84000.model\u001b[0m\n",
      "\n",
      "\n",
      "\n",
      "====================\n",
      "User time: 966.3750\n",
      "====================\n"
     ]
    }
   ],
   "source": [
    "from DeezyMatch import finetune as dm_finetune\n",
    "\n",
    "n_ft_examples = 84000\n",
    "\n",
    "# Fine-tune the pretrained model at pretrained_model_path (vocabulary at pretrained_vocab_path)\n",
    "# on n_ft_examples training examples from dataset_path\n",
    "dm_finetune(input_file_path=\"./inputs/input_dfm_gru_model_B_no_early_stopping.yaml\",\n",
    "            dataset_path=\"./dataset/ocr_trainval.txt\",\n",
    "            model_name=f\"wikigaz_en_ft_ocr_gru_v002_n{n_ft_examples}\",\n",
    "            pretrained_model_path=\"./models/wikigaz_en_gru_001/wikigaz_en_gru_001.model\",\n",
    "            pretrained_vocab_path=\"./models/wikigaz_en_gru_001/wikigaz_en_gru_001.vocab\",\n",
    "            n_train_examples=n_ft_examples\n",
    "           )"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Fine-Tune, model B, LSTM"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 55,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:13:47\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mread input file: ./inputs/input_dfm_lstm_model_B.yaml\u001b[0m\n",
      "\u001b[92m2020-09-10 19:13:47\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32mpytorch will use: cuda:1\u001b[0m\n",
      "\u001b[92m2020-09-10 19:13:47\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mread CSV file: ./dataset/ocr_trainval.txt\u001b[0m\n",
      "\u001b[92m2020-09-10 19:13:47\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32mnumber of labels, True: 42301 and False: 42302\u001b[0m\n",
      "\u001b[92m2020-09-10 19:13:47\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mSplitting the Dataset\u001b[0m\n",
      "\u001b[92m2020-09-10 19:13:47\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mfinish splitting the Dataset. User time: 0.05909252166748047\u001b[0m\n",
      "\u001b[92m2020-09-10 19:13:47\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msplits are as follow:\n",
      "not_assigned    58971\n",
      "val             25380\n",
      "train             250\n",
      "test                2\n",
      "Name: split, dtype: int64\u001b[0m\n",
      "\u001b[92m2020-09-10 19:13:47\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mstart creating a lookup table and convert characters to indices\u001b[0m\n",
      "\u001b[92m2020-09-10 19:13:47\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32m-- create vocabulary\u001b[0m\n",
      "\u001b[92m2020-09-10 19:13:49\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32m-- convert tokens to indices\u001b[0m\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "s1 padding:   0%|          | 0/25380 [00:00<?, ?it/s]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:13:49\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mskipping 0 lines\u001b[0m\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "                                                                     \r"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "\n",
      "============================================================\n",
      "List all parameters in the model\n",
      "============================================================\n",
      "emb.weight False\n",
      "rnn_1.weight_ih_l0 True\n",
      "rnn_1.weight_hh_l0 True\n",
      "rnn_1.bias_ih_l0 True\n",
      "rnn_1.bias_hh_l0 True\n",
      "rnn_1.weight_ih_l0_reverse True\n",
      "rnn_1.weight_hh_l0_reverse True\n",
      "rnn_1.bias_ih_l0_reverse True\n",
      "rnn_1.bias_hh_l0_reverse True\n",
      "rnn_1.weight_ih_l1 True\n",
      "rnn_1.weight_hh_l1 True\n",
      "rnn_1.bias_ih_l1 True\n",
      "rnn_1.bias_hh_l1 True\n",
      "rnn_1.weight_ih_l1_reverse True\n",
      "rnn_1.weight_hh_l1_reverse True\n",
      "rnn_1.bias_ih_l1_reverse True\n",
      "rnn_1.bias_hh_l1_reverse True\n",
      "attn_step1.weight True\n",
      "attn_step1.bias True\n",
      "attn_step2.weight True\n",
      "attn_step2.bias True\n",
      "fc1.weight True\n",
      "fc1.bias True\n",
      "fc2.weight True\n",
      "fc2.bias True\n",
      "============================================================\n",
      "\n",
      "\n",
      "\n",
      "\u001b[92m2020-09-10 19:13:50\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m******************************\u001b[0m\n",
      "\u001b[92m2020-09-10 19:13:50\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m**** (Bi-directional) LSTM ****\u001b[0m\n",
      "\u001b[92m2020-09-10 19:13:50\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m******************************\u001b[0m\n",
      "\u001b[92m2020-09-10 19:13:50\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mNumber of batches: 4\u001b[0m\n",
      "\u001b[92m2020-09-10 19:13:50\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mNumber of epochs: 20\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "fe91795984784c32866ee16c879be4ee",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=20.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=4.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "\n",
      "\n",
      "====================\n",
      "Total number of params: 721323\n",
      "\n",
      "two_parallel_rnns (\n",
      "  (emb): Embedding(7542, 60), weights=((7542, 60),), parameters=452520\n",
      "  (rnn_1): LSTM(60, 60, num_layers=2, dropout=0.01, bidirectional=True), weights=((240, 60), (240, 60), (240,), (240,), (240, 60), (240, 60), (240,), (240,), (240, 120), (240, 60), (240,), (240,), (240, 120), (240, 60), (240,), (240,)), parameters=145920\n",
      "  (attn_step1): Linear(in_features=120, out_features=60, bias=True), weights=((60, 120), (60,)), parameters=7260\n",
      "  (attn_step2): Linear(in_features=60, out_features=1, bias=True), weights=((1, 60), (1,)), parameters=61\n",
      "  (fc1): Linear(in_features=960, out_features=120, bias=True), weights=((120, 960), (120,)), parameters=115320\n",
      "  (fc2): Linear(in_features=120, out_features=2, bias=True), weights=((2, 120), (2,)), parameters=242\n",
      ")\n",
      "====================\n",
      "\n",
      "\n",
      "\u001b[92m2020-09-10 19:13:50\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_19:13:50 -- Epoch: 1/20; Train; loss: 1.577; acc: 0.448; precision: 0.446, recall: 0.432, macrof1: 0.448, weightedf1: 0.448\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:13:59\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_19:13:59 -- Epoch: 1/20; Valid; loss: 1.609; acc: 0.480; precision: 0.482, recall: 0.519, macrof1: 0.479, weightedf1: 0.479\u001b[0m\n",
      "\u001b[92m2020-09-10 19:13:59\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=4.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:13:59\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_19:13:59 -- Epoch: 2/20; Train; loss: 0.977; acc: 0.580; precision: 0.588, recall: 0.536, macrof1: 0.579, weightedf1: 0.579\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:14:08\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_19:14:08 -- Epoch: 2/20; Valid; loss: 1.437; acc: 0.505; precision: 0.505, recall: 0.541, macrof1: 0.505, weightedf1: 0.505\u001b[0m\n",
      "\u001b[92m2020-09-10 19:14:08\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=4.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:14:08\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_19:14:08 -- Epoch: 3/20; Train; loss: 0.644; acc: 0.724; precision: 0.733, recall: 0.704, macrof1: 0.724, weightedf1: 0.724\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:14:17\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_19:14:17 -- Epoch: 3/20; Valid; loss: 1.318; acc: 0.525; precision: 0.523, recall: 0.583, macrof1: 0.524, weightedf1: 0.524\u001b[0m\n",
      "\u001b[92m2020-09-10 19:14:17\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=4.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:14:17\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_19:14:17 -- Epoch: 4/20; Train; loss: 0.438; acc: 0.816; precision: 0.826, recall: 0.800, macrof1: 0.816, weightedf1: 0.816\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:14:26\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_19:14:26 -- Epoch: 4/20; Valid; loss: 1.233; acc: 0.542; precision: 0.536, recall: 0.620, macrof1: 0.539, weightedf1: 0.539\u001b[0m\n",
      "\u001b[92m2020-09-10 19:14:26\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=4.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:14:27\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_19:14:27 -- Epoch: 5/20; Train; loss: 0.319; acc: 0.876; precision: 0.856, recall: 0.904, macrof1: 0.876, weightedf1: 0.876\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:14:36\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_19:14:36 -- Epoch: 5/20; Valid; loss: 1.177; acc: 0.554; precision: 0.545, recall: 0.650, macrof1: 0.550, weightedf1: 0.550\u001b[0m\n",
      "\u001b[92m2020-09-10 19:14:36\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=4.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:14:36\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_19:14:36 -- Epoch: 6/20; Train; loss: 0.237; acc: 0.912; precision: 0.887, recall: 0.944, macrof1: 0.912, weightedf1: 0.912\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:14:45\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_19:14:45 -- Epoch: 6/20; Valid; loss: 1.139; acc: 0.566; precision: 0.554, recall: 0.674, macrof1: 0.561, weightedf1: 0.561\u001b[0m\n",
      "\u001b[92m2020-09-10 19:14:45\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=4.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:14:45\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_19:14:45 -- Epoch: 7/20; Train; loss: 0.179; acc: 0.940; precision: 0.923, recall: 0.960, macrof1: 0.940, weightedf1: 0.940\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:14:54\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_19:14:54 -- Epoch: 7/20; Valid; loss: 1.113; acc: 0.574; precision: 0.561, recall: 0.683, macrof1: 0.569, weightedf1: 0.569\u001b[0m\n",
      "\u001b[92m2020-09-10 19:14:54\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=4.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:14:54\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_19:14:54 -- Epoch: 8/20; Train; loss: 0.145; acc: 0.972; precision: 0.961, recall: 0.984, macrof1: 0.972, weightedf1: 0.972\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:15:03\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_19:15:03 -- Epoch: 8/20; Valid; loss: 1.093; acc: 0.580; precision: 0.566, recall: 0.683, macrof1: 0.576, weightedf1: 0.576\u001b[0m\n",
      "\u001b[92m2020-09-10 19:15:03\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=4.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:15:03\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_19:15:03 -- Epoch: 9/20; Train; loss: 0.115; acc: 0.988; precision: 0.984, recall: 0.992, macrof1: 0.988, weightedf1: 0.988\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:15:12\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_19:15:12 -- Epoch: 9/20; Valid; loss: 1.080; acc: 0.587; precision: 0.573, recall: 0.680, macrof1: 0.583, weightedf1: 0.583\u001b[0m\n",
      "\u001b[92m2020-09-10 19:15:12\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=4.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:15:13\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_19:15:13 -- Epoch: 10/20; Train; loss: 0.095; acc: 0.992; precision: 0.992, recall: 0.992, macrof1: 0.992, weightedf1: 0.992\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:15:22\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_19:15:22 -- Epoch: 10/20; Valid; loss: 1.073; acc: 0.592; precision: 0.579, recall: 0.675, macrof1: 0.589, weightedf1: 0.589\u001b[0m\n",
      "\u001b[92m2020-09-10 19:15:22\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=4.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:15:22\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_19:15:22 -- Epoch: 11/20; Train; loss: 0.078; acc: 0.996; precision: 0.992, recall: 1.000, macrof1: 0.996, weightedf1: 0.996\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:15:31\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_19:15:31 -- Epoch: 11/20; Valid; loss: 1.073; acc: 0.596; precision: 0.583, recall: 0.676, macrof1: 0.593, weightedf1: 0.593\u001b[0m\n",
      "\u001b[92m2020-09-10 19:15:31\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=4.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:15:31\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_19:15:31 -- Epoch: 12/20; Train; loss: 0.066; acc: 0.996; precision: 0.992, recall: 1.000, macrof1: 0.996, weightedf1: 0.996\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:15:40\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_19:15:40 -- Epoch: 12/20; Valid; loss: 1.076; acc: 0.599; precision: 0.585, recall: 0.678, macrof1: 0.596, weightedf1: 0.596\u001b[0m\n",
      "\u001b[92m2020-09-10 19:15:40\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model (early stopped) with least valid loss (checkpoint: 11) at ./models/wikigaz_en_ft_ocr_lstm_v002_n250/wikigaz_en_ft_ocr_lstm_v002_n250.model\u001b[0m\n",
      "\u001b[92m2020-09-10 19:15:40\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n",
      "\u001b[92m2020-09-10 19:15:40\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mEarly stopping at epoch: 12, selected epoch: 11\u001b[0m\n",
      "\n",
      "\n",
      "\n",
      "\n",
      "====================\n",
      "User time: 110.5851\n",
      "====================\n"
     ]
    }
   ],
   "source": [
    "from DeezyMatch import finetune as dm_finetune\n",
    "\n",
    "n_ft_examples = 250\n",
    "\n",
    "# Fine-tune the pretrained baseline (WG:en) LSTM with the model B config (only the embedding layer is frozen),\n",
    "# using n_ft_examples training instances from the OCR dataset.\n",
    "dm_finetune(input_file_path=\"./inputs/input_dfm_lstm_model_B.yaml\",\n",
    "            dataset_path=\"./dataset/ocr_trainval.txt\",\n",
    "            model_name=f\"wikigaz_en_ft_ocr_lstm_v002_n{n_ft_examples}\",\n",
    "            pretrained_model_path=\"./models/wikigaz_en_lstm_001/wikigaz_en_lstm_001.model\",\n",
    "            pretrained_vocab_path=\"./models/wikigaz_en_lstm_001/wikigaz_en_lstm_001.vocab\",\n",
    "            n_train_examples=n_ft_examples\n",
    "           )"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 56,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:15:40\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mread input file: ./inputs/input_dfm_lstm_model_B.yaml\u001b[0m\n",
      "\u001b[92m2020-09-10 19:15:40\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32mpytorch will use: cuda:1\u001b[0m\n",
      "\u001b[92m2020-09-10 19:15:40\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mread CSV file: ./dataset/ocr_trainval.txt\u001b[0m\n",
      "\u001b[92m2020-09-10 19:15:41\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32mnumber of labels, True: 42301 and False: 42302\u001b[0m\n",
      "\u001b[92m2020-09-10 19:15:41\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mSplitting the Dataset\u001b[0m\n",
      "\u001b[92m2020-09-10 19:15:41\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mfinish splitting the Dataset. User time: 0.056529998779296875\u001b[0m\n",
      "\u001b[92m2020-09-10 19:15:41\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msplits are as follow:\n",
      "not_assigned    58721\n",
      "val             25380\n",
      "train             500\n",
      "test                2\n",
      "Name: split, dtype: int64\u001b[0m\n",
      "\u001b[92m2020-09-10 19:15:41\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mstart creating a lookup table and convert characters to indices\u001b[0m\n",
      "\u001b[92m2020-09-10 19:15:41\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32m-- create vocabulary\u001b[0m\n",
      "\u001b[92m2020-09-10 19:15:42\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32m-- convert tokens to indices\u001b[0m\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "s1 padding:   0%|          | 0/25380 [00:00<?, ?it/s]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:15:43\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mskipping 0 lines\u001b[0m\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "\n",
      "============================================================\n",
      "List all parameters in the model\n",
      "============================================================\n",
      "emb.weight False\n",
      "rnn_1.weight_ih_l0 True\n",
      "rnn_1.weight_hh_l0 True\n",
      "rnn_1.bias_ih_l0 True\n",
      "rnn_1.bias_hh_l0 True\n",
      "rnn_1.weight_ih_l0_reverse True\n",
      "rnn_1.weight_hh_l0_reverse True\n",
      "rnn_1.bias_ih_l0_reverse True\n",
      "rnn_1.bias_hh_l0_reverse True\n",
      "rnn_1.weight_ih_l1 True\n",
      "rnn_1.weight_hh_l1 True\n",
      "rnn_1.bias_ih_l1 True\n",
      "rnn_1.bias_hh_l1 True\n",
      "rnn_1.weight_ih_l1_reverse True\n",
      "rnn_1.weight_hh_l1_reverse True\n",
      "rnn_1.bias_ih_l1_reverse True\n",
      "rnn_1.bias_hh_l1_reverse True\n",
      "attn_step1.weight True\n",
      "attn_step1.bias True\n",
      "attn_step2.weight True\n",
      "attn_step2.bias True\n",
      "fc1.weight True\n",
      "fc1.bias True\n",
      "fc2.weight True\n",
      "fc2.bias True\n",
      "============================================================\n",
      "\n",
      "\n",
      "\n",
      "\u001b[92m2020-09-10 19:15:43\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m******************************\u001b[0m\n",
      "\u001b[92m2020-09-10 19:15:43\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m**** (Bi-directional) LSTM ****\u001b[0m\n",
      "\u001b[92m2020-09-10 19:15:43\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m******************************\u001b[0m\n",
      "\u001b[92m2020-09-10 19:15:43\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mNumber of batches: 8\u001b[0m\n",
      "\u001b[92m2020-09-10 19:15:43\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mNumber of epochs: 20\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "e00dc6c936064aecbf845c8d7e205654",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=20.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=8.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "\n",
      "\n",
      "====================\n",
      "Total number of params: 721323\n",
      "\n",
      "two_parallel_rnns (\n",
      "  (emb): Embedding(7542, 60), weights=((7542, 60),), parameters=452520\n",
      "  (rnn_1): LSTM(60, 60, num_layers=2, dropout=0.01, bidirectional=True), weights=((240, 60), (240, 60), (240,), (240,), (240, 60), (240, 60), (240,), (240,), (240, 120), (240, 60), (240,), (240,), (240, 120), (240, 60), (240,), (240,)), parameters=145920\n",
      "  (attn_step1): Linear(in_features=120, out_features=60, bias=True), weights=((60, 120), (60,)), parameters=7260\n",
      "  (attn_step2): Linear(in_features=60, out_features=1, bias=True), weights=((1, 60), (1,)), parameters=61\n",
      "  (fc1): Linear(in_features=960, out_features=120, bias=True), weights=((120, 960), (120,)), parameters=115320\n",
      "  (fc2): Linear(in_features=120, out_features=2, bias=True), weights=((2, 120), (2,)), parameters=242\n",
      ")\n",
      "====================\n",
      "\n",
      "\n",
      "\u001b[92m2020-09-10 19:15:44\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_19:15:44 -- Epoch: 1/20; Train; loss: 1.497; acc: 0.456; precision: 0.454, recall: 0.436, macrof1: 0.456, weightedf1: 0.456\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:15:53\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_19:15:53 -- Epoch: 1/20; Valid; loss: 1.416; acc: 0.508; precision: 0.507, recall: 0.543, macrof1: 0.507, weightedf1: 0.507\u001b[0m\n",
      "\u001b[92m2020-09-10 19:15:53\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=8.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:15:53\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_19:15:53 -- Epoch: 2/20; Train; loss: 0.832; acc: 0.638; precision: 0.636, recall: 0.644, macrof1: 0.638, weightedf1: 0.638\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:16:02\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_19:16:02 -- Epoch: 2/20; Valid; loss: 1.148; acc: 0.553; precision: 0.547, recall: 0.621, macrof1: 0.551, weightedf1: 0.551\u001b[0m\n",
      "\u001b[92m2020-09-10 19:16:02\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=8.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:16:03\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_19:16:03 -- Epoch: 3/20; Train; loss: 0.523; acc: 0.774; precision: 0.762, recall: 0.796, macrof1: 0.774, weightedf1: 0.774\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:16:11\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_19:16:11 -- Epoch: 3/20; Valid; loss: 1.008; acc: 0.585; precision: 0.573, recall: 0.666, macrof1: 0.582, weightedf1: 0.582\u001b[0m\n",
      "\u001b[92m2020-09-10 19:16:11\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=8.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:16:12\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_19:16:12 -- Epoch: 4/20; Train; loss: 0.390; acc: 0.844; precision: 0.826, recall: 0.872, macrof1: 0.844, weightedf1: 0.844\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:16:21\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_19:16:21 -- Epoch: 4/20; Valid; loss: 0.939; acc: 0.605; precision: 0.589, recall: 0.693, macrof1: 0.602, weightedf1: 0.602\u001b[0m\n",
      "\u001b[92m2020-09-10 19:16:21\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=8.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:16:21\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_19:16:21 -- Epoch: 5/20; Train; loss: 0.301; acc: 0.892; precision: 0.871, recall: 0.920, macrof1: 0.892, weightedf1: 0.892\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:16:30\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_19:16:30 -- Epoch: 5/20; Valid; loss: 0.895; acc: 0.618; precision: 0.604, recall: 0.687, macrof1: 0.617, weightedf1: 0.617\u001b[0m\n",
      "\u001b[92m2020-09-10 19:16:30\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=8.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:16:31\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_19:16:31 -- Epoch: 6/20; Train; loss: 0.231; acc: 0.934; precision: 0.922, recall: 0.948, macrof1: 0.934, weightedf1: 0.934\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:16:40\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_19:16:40 -- Epoch: 6/20; Valid; loss: 0.877; acc: 0.632; precision: 0.618, recall: 0.690, macrof1: 0.631, weightedf1: 0.631\u001b[0m\n",
      "\u001b[92m2020-09-10 19:16:40\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=8.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:16:40\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_19:16:40 -- Epoch: 7/20; Train; loss: 0.178; acc: 0.968; precision: 0.957, recall: 0.980, macrof1: 0.968, weightedf1: 0.968\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:16:49\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_19:16:49 -- Epoch: 7/20; Valid; loss: 0.875; acc: 0.642; precision: 0.627, recall: 0.704, macrof1: 0.641, weightedf1: 0.641\u001b[0m\n",
      "\u001b[92m2020-09-10 19:16:49\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=8.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:16:49\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_19:16:49 -- Epoch: 8/20; Train; loss: 0.142; acc: 0.976; precision: 0.969, recall: 0.984, macrof1: 0.976, weightedf1: 0.976\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:16:58\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_19:16:58 -- Epoch: 8/20; Valid; loss: 0.874; acc: 0.652; precision: 0.638, recall: 0.704, macrof1: 0.651, weightedf1: 0.651\u001b[0m\n",
      "\u001b[92m2020-09-10 19:16:58\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=8.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:16:59\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_19:16:59 -- Epoch: 9/20; Train; loss: 0.112; acc: 0.992; precision: 0.992, recall: 0.992, macrof1: 0.992, weightedf1: 0.992\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:17:08\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_19:17:08 -- Epoch: 9/20; Valid; loss: 0.878; acc: 0.659; precision: 0.643, recall: 0.715, macrof1: 0.658, weightedf1: 0.658\u001b[0m\n",
      "\u001b[92m2020-09-10 19:17:08\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model (early stopped) with least valid loss (checkpoint: 8) at ./models/wikigaz_en_ft_ocr_lstm_v002_n500/wikigaz_en_ft_ocr_lstm_v002_n500.model\u001b[0m\n",
      "\u001b[92m2020-09-10 19:17:08\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n",
      "\u001b[92m2020-09-10 19:17:08\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mEarly stopping at epoch: 9, selected epoch: 8\u001b[0m\n",
      "\n",
      "\n",
      "\n",
      "\n",
      "====================\n",
      "User time: 84.4243\n",
      "====================\n"
     ]
    }
   ],
   "source": [
    "from DeezyMatch import finetune as dm_finetune\n",
    "\n",
    "n_ft_examples = 500\n",
    "\n",
    "# fine-tune a pretrained model stored at pretrained_model_path and pretrained_vocab_path \n",
    "dm_finetune(input_file_path=\"./inputs/input_dfm_lstm_model_B.yaml\", \n",
    "            dataset_path=\"./dataset/ocr_trainval.txt\", \n",
    "            model_name=f\"wikigaz_en_ft_ocr_lstm_v002_n{n_ft_examples}\",\n",
    "            pretrained_model_path=\"./models/wikigaz_en_lstm_001/wikigaz_en_lstm_001.model\", \n",
    "            pretrained_vocab_path=\"./models/wikigaz_en_lstm_001/wikigaz_en_lstm_001.vocab\",\n",
    "            n_train_examples=n_ft_examples\n",
    "           )"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 57,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:17:08\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mread input file: ./inputs/input_dfm_lstm_model_B.yaml\u001b[0m\n",
      "\u001b[92m2020-09-10 19:17:08\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32mpytorch will use: cuda:1\u001b[0m\n",
      "\u001b[92m2020-09-10 19:17:08\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mread CSV file: ./dataset/ocr_trainval.txt\u001b[0m\n",
      "\u001b[92m2020-09-10 19:17:09\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32mnumber of labels, True: 42301 and False: 42302\u001b[0m\n",
      "\u001b[92m2020-09-10 19:17:09\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mSplitting the Dataset\u001b[0m\n",
      "\u001b[92m2020-09-10 19:17:09\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mfinish splitting the Dataset. User time: 0.060021400451660156\u001b[0m\n",
      "\u001b[92m2020-09-10 19:17:09\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msplits are as follow:\n",
      "not_assigned    58221\n",
      "val             25380\n",
      "train            1000\n",
      "test                2\n",
      "Name: split, dtype: int64\u001b[0m\n",
      "\u001b[92m2020-09-10 19:17:09\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mstart creating a lookup table and convert characters to indices\u001b[0m\n",
      "\u001b[92m2020-09-10 19:17:09\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32m-- create vocabulary\u001b[0m\n",
      "\u001b[92m2020-09-10 19:17:10\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32m-- convert tokens to indices\u001b[0m\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "s1 padding:   0%|          | 0/25380 [00:00<?, ?it/s]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:17:10\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mskipping 0 lines\u001b[0m\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "                                                                     \r"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "\n",
      "============================================================\n",
      "List all parameters in the model\n",
      "============================================================\n",
      "emb.weight False\n",
      "rnn_1.weight_ih_l0 True\n",
      "rnn_1.weight_hh_l0 True\n",
      "rnn_1.bias_ih_l0 True\n",
      "rnn_1.bias_hh_l0 True\n",
      "rnn_1.weight_ih_l0_reverse True\n",
      "rnn_1.weight_hh_l0_reverse True\n",
      "rnn_1.bias_ih_l0_reverse True\n",
      "rnn_1.bias_hh_l0_reverse True\n",
      "rnn_1.weight_ih_l1 True\n",
      "rnn_1.weight_hh_l1 True\n",
      "rnn_1.bias_ih_l1 True\n",
      "rnn_1.bias_hh_l1 True\n",
      "rnn_1.weight_ih_l1_reverse True\n",
      "rnn_1.weight_hh_l1_reverse True\n",
      "rnn_1.bias_ih_l1_reverse True\n",
      "rnn_1.bias_hh_l1_reverse True\n",
      "attn_step1.weight True\n",
      "attn_step1.bias True\n",
      "attn_step2.weight True\n",
      "attn_step2.bias True\n",
      "fc1.weight True\n",
      "fc1.bias True\n",
      "fc2.weight True\n",
      "fc2.bias True\n",
      "============================================================\n",
      "\n",
      "\n",
      "\n",
      "\u001b[92m2020-09-10 19:17:11\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m******************************\u001b[0m\n",
      "\u001b[92m2020-09-10 19:17:11\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m**** (Bi-directional) LSTM ****\u001b[0m\n",
      "\u001b[92m2020-09-10 19:17:11\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m******************************\u001b[0m\n",
      "\u001b[92m2020-09-10 19:17:11\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mNumber of batches: 16\u001b[0m\n",
      "\u001b[92m2020-09-10 19:17:11\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mNumber of epochs: 20\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "3ea8e5980a5a4181bf8f87719c9ed905",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=20.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=16.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "\n",
      "\n",
      "====================\n",
      "Total number of params: 721323\n",
      "\n",
      "two_parallel_rnns (\n",
      "  (emb): Embedding(7542, 60), weights=((7542, 60),), parameters=452520\n",
      "  (rnn_1): LSTM(60, 60, num_layers=2, dropout=0.01, bidirectional=True), weights=((240, 60), (240, 60), (240,), (240,), (240, 60), (240, 60), (240,), (240,), (240, 120), (240, 60), (240,), (240,), (240, 120), (240, 60), (240,), (240,)), parameters=145920\n",
      "  (attn_step1): Linear(in_features=120, out_features=60, bias=True), weights=((60, 120), (60,)), parameters=7260\n",
      "  (attn_step2): Linear(in_features=60, out_features=1, bias=True), weights=((1, 60), (1,)), parameters=61\n",
      "  (fc1): Linear(in_features=960, out_features=120, bias=True), weights=((120, 960), (120,)), parameters=115320\n",
      "  (fc2): Linear(in_features=120, out_features=2, bias=True), weights=((2, 120), (2,)), parameters=242\n",
      ")\n",
      "====================\n",
      "\n",
      "\n",
      "\u001b[92m2020-09-10 19:17:12\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_19:17:12 -- Epoch: 1/20; Train; loss: 1.376; acc: 0.505; precision: 0.505, recall: 0.486, macrof1: 0.505, weightedf1: 0.505\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:17:21\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_19:17:21 -- Epoch: 1/20; Valid; loss: 1.094; acc: 0.557; precision: 0.554, recall: 0.584, macrof1: 0.556, weightedf1: 0.556\u001b[0m\n",
      "\u001b[92m2020-09-10 19:17:21\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=16.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:17:22\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_19:17:22 -- Epoch: 2/20; Train; loss: 0.672; acc: 0.710; precision: 0.692, recall: 0.756, macrof1: 0.709, weightedf1: 0.709\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:17:31\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_19:17:31 -- Epoch: 2/20; Valid; loss: 0.833; acc: 0.620; precision: 0.606, recall: 0.686, macrof1: 0.619, weightedf1: 0.619\u001b[0m\n",
      "\u001b[92m2020-09-10 19:17:31\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=16.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:17:32\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_19:17:32 -- Epoch: 3/20; Train; loss: 0.455; acc: 0.802; precision: 0.797, recall: 0.810, macrof1: 0.802, weightedf1: 0.802\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:17:41\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_19:17:41 -- Epoch: 3/20; Valid; loss: 0.735; acc: 0.657; precision: 0.651, recall: 0.677, macrof1: 0.657, weightedf1: 0.657\u001b[0m\n",
      "\u001b[92m2020-09-10 19:17:41\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=16.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:17:41\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_19:17:41 -- Epoch: 4/20; Train; loss: 0.338; acc: 0.875; precision: 0.857, recall: 0.900, macrof1: 0.875, weightedf1: 0.875\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:17:50\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_19:17:50 -- Epoch: 4/20; Valid; loss: 0.696; acc: 0.681; precision: 0.670, recall: 0.716, macrof1: 0.681, weightedf1: 0.681\u001b[0m\n",
      "\u001b[92m2020-09-10 19:17:50\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=16.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:17:51\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_19:17:51 -- Epoch: 5/20; Train; loss: 0.262; acc: 0.916; precision: 0.906, recall: 0.928, macrof1: 0.916, weightedf1: 0.916\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:18:00\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_19:18:00 -- Epoch: 5/20; Valid; loss: 0.677; acc: 0.702; precision: 0.698, recall: 0.714, macrof1: 0.702, weightedf1: 0.702\u001b[0m\n",
      "\u001b[92m2020-09-10 19:18:00\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=16.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:18:01\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_19:18:01 -- Epoch: 6/20; Train; loss: 0.205; acc: 0.947; precision: 0.944, recall: 0.950, macrof1: 0.947, weightedf1: 0.947\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:18:09\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_19:18:09 -- Epoch: 6/20; Valid; loss: 0.666; acc: 0.717; precision: 0.710, recall: 0.734, macrof1: 0.717, weightedf1: 0.717\u001b[0m\n",
      "\u001b[92m2020-09-10 19:18:09\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=16.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:18:10\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_19:18:10 -- Epoch: 7/20; Train; loss: 0.159; acc: 0.962; precision: 0.955, recall: 0.970, macrof1: 0.962, weightedf1: 0.962\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:18:19\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_19:18:19 -- Epoch: 7/20; Valid; loss: 0.657; acc: 0.727; precision: 0.721, recall: 0.742, macrof1: 0.727, weightedf1: 0.727\u001b[0m\n",
      "\u001b[92m2020-09-10 19:18:19\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=16.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:18:20\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_19:18:20 -- Epoch: 8/20; Train; loss: 0.123; acc: 0.982; precision: 0.976, recall: 0.988, macrof1: 0.982, weightedf1: 0.982\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:18:28\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_19:18:28 -- Epoch: 8/20; Valid; loss: 0.653; acc: 0.737; precision: 0.736, recall: 0.739, macrof1: 0.737, weightedf1: 0.737\u001b[0m\n",
      "\u001b[92m2020-09-10 19:18:28\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=16.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:18:29\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_19:18:29 -- Epoch: 9/20; Train; loss: 0.094; acc: 0.987; precision: 0.984, recall: 0.990, macrof1: 0.987, weightedf1: 0.987\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:18:38\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_19:18:38 -- Epoch: 9/20; Valid; loss: 0.655; acc: 0.745; precision: 0.741, recall: 0.755, macrof1: 0.745, weightedf1: 0.745\u001b[0m\n",
      "\u001b[92m2020-09-10 19:18:38\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model (early stopped) with least valid loss (checkpoint: 8) at ./models/wikigaz_en_ft_ocr_lstm_v002_n1000/wikigaz_en_ft_ocr_lstm_v002_n1000.model\u001b[0m\n",
      "\u001b[92m2020-09-10 19:18:38\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n",
      "\u001b[92m2020-09-10 19:18:38\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mEarly stopping at epoch: 9, selected epoch: 8\u001b[0m\n",
      "\n",
      "\n",
      "\n",
      "\n",
      "====================\n",
      "User time: 87.2856\n",
      "====================\n"
     ]
    }
   ],
   "source": [
    "from DeezyMatch import finetune as dm_finetune\n",
    "\n",
    "# number of OCR training instances used for fine-tuning\n",
    "n_ft_examples = 1000\n",
    "\n",
    "# Fine-tune the pretrained model (pretrained_model_path, pretrained_vocab_path)\n",
    "# on n_ft_examples instances of the OCR dataset, using the model B input file.\n",
    "dm_finetune(input_file_path=\"./inputs/input_dfm_lstm_model_B.yaml\",\n",
    "            dataset_path=\"./dataset/ocr_trainval.txt\",\n",
    "            model_name=f\"wikigaz_en_ft_ocr_lstm_v002_n{n_ft_examples}\",\n",
    "            pretrained_model_path=\"./models/wikigaz_en_lstm_001/wikigaz_en_lstm_001.model\",\n",
    "            pretrained_vocab_path=\"./models/wikigaz_en_lstm_001/wikigaz_en_lstm_001.vocab\",\n",
    "            n_train_examples=n_ft_examples\n",
    "           )"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 58,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:18:38\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mread input file: ./inputs/input_dfm_lstm_model_B.yaml\u001b[0m\n",
      "\u001b[92m2020-09-10 19:18:38\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32mpytorch will use: cuda:1\u001b[0m\n",
      "\u001b[92m2020-09-10 19:18:38\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mread CSV file: ./dataset/ocr_trainval.txt\u001b[0m\n",
      "\u001b[92m2020-09-10 19:18:39\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32mnumber of labels, True: 42301 and False: 42302\u001b[0m\n",
      "\u001b[92m2020-09-10 19:18:39\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mSplitting the Dataset\u001b[0m\n",
      "\u001b[92m2020-09-10 19:18:39\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mfinish splitting the Dataset. User time: 0.06361913681030273\u001b[0m\n",
      "\u001b[92m2020-09-10 19:18:39\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msplits are as follow:\n",
      "not_assigned    57221\n",
      "val             25380\n",
      "train            2000\n",
      "test                2\n",
      "Name: split, dtype: int64\u001b[0m\n",
      "\u001b[92m2020-09-10 19:18:39\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mstart creating a lookup table and convert characters to indices\u001b[0m\n",
      "\u001b[92m2020-09-10 19:18:39\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32m-- create vocabulary\u001b[0m\n",
      "\u001b[92m2020-09-10 19:18:40\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32m-- convert tokens to indices\u001b[0m\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "s1 padding:   0%|          | 0/25380 [00:00<?, ?it/s]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:18:41\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mskipping 0 lines\u001b[0m\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "                                                                     \r"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "\n",
      "============================================================\n",
      "List all parameters in the model\n",
      "============================================================\n",
      "emb.weight False\n",
      "rnn_1.weight_ih_l0 True\n",
      "rnn_1.weight_hh_l0 True\n",
      "rnn_1.bias_ih_l0 True\n",
      "rnn_1.bias_hh_l0 True\n",
      "rnn_1.weight_ih_l0_reverse True\n",
      "rnn_1.weight_hh_l0_reverse True\n",
      "rnn_1.bias_ih_l0_reverse True\n",
      "rnn_1.bias_hh_l0_reverse True\n",
      "rnn_1.weight_ih_l1 True\n",
      "rnn_1.weight_hh_l1 True\n",
      "rnn_1.bias_ih_l1 True\n",
      "rnn_1.bias_hh_l1 True\n",
      "rnn_1.weight_ih_l1_reverse True\n",
      "rnn_1.weight_hh_l1_reverse True\n",
      "rnn_1.bias_ih_l1_reverse True\n",
      "rnn_1.bias_hh_l1_reverse True\n",
      "attn_step1.weight True\n",
      "attn_step1.bias True\n",
      "attn_step2.weight True\n",
      "attn_step2.bias True\n",
      "fc1.weight True\n",
      "fc1.bias True\n",
      "fc2.weight True\n",
      "fc2.bias True\n",
      "============================================================\n",
      "\n",
      "\n",
      "\n",
      "\u001b[92m2020-09-10 19:18:41\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m******************************\u001b[0m\n",
      "\u001b[92m2020-09-10 19:18:41\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m**** (Bi-directional) LSTM ****\u001b[0m\n",
      "\u001b[92m2020-09-10 19:18:41\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m******************************\u001b[0m\n",
      "\u001b[92m2020-09-10 19:18:41\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mNumber of batches: 32\u001b[0m\n",
      "\u001b[92m2020-09-10 19:18:41\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mNumber of epochs: 20\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "193fd37e738748bda8f82d2404fbcd4f",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=20.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=32.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "\n",
      "\n",
      "====================\n",
      "Total number of params: 721323\n",
      "\n",
      "two_parallel_rnns (\n",
      "  (emb): Embedding(7542, 60), weights=((7542, 60),), parameters=452520\n",
      "  (rnn_1): LSTM(60, 60, num_layers=2, dropout=0.01, bidirectional=True), weights=((240, 60), (240, 60), (240,), (240,), (240, 60), (240, 60), (240,), (240,), (240, 120), (240, 60), (240,), (240,), (240, 120), (240, 60), (240,), (240,)), parameters=145920\n",
      "  (attn_step1): Linear(in_features=120, out_features=60, bias=True), weights=((60, 120), (60,)), parameters=7260\n",
      "  (attn_step2): Linear(in_features=60, out_features=1, bias=True), weights=((1, 60), (1,)), parameters=61\n",
      "  (fc1): Linear(in_features=960, out_features=120, bias=True), weights=((120, 960), (120,)), parameters=115320\n",
      "  (fc2): Linear(in_features=120, out_features=2, bias=True), weights=((2, 120), (2,)), parameters=242\n",
      ")\n",
      "====================\n",
      "\n",
      "\n",
      "\u001b[92m2020-09-10 19:18:43\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_19:18:43 -- Epoch: 1/20; Train; loss: 1.142; acc: 0.568; precision: 0.565, recall: 0.593, macrof1: 0.568, weightedf1: 0.568\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:18:51\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_19:18:51 -- Epoch: 1/20; Valid; loss: 0.808; acc: 0.632; precision: 0.623, recall: 0.668, macrof1: 0.632, weightedf1: 0.632\u001b[0m\n",
      "\u001b[92m2020-09-10 19:18:51\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=32.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:18:53\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_19:18:53 -- Epoch: 2/20; Train; loss: 0.533; acc: 0.754; precision: 0.735, recall: 0.794, macrof1: 0.754, weightedf1: 0.754\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:19:02\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_19:19:02 -- Epoch: 2/20; Valid; loss: 0.639; acc: 0.698; precision: 0.695, recall: 0.705, macrof1: 0.698, weightedf1: 0.698\u001b[0m\n",
      "\u001b[92m2020-09-10 19:19:02\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=32.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:19:04\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_19:19:04 -- Epoch: 3/20; Train; loss: 0.387; acc: 0.843; precision: 0.833, recall: 0.858, macrof1: 0.843, weightedf1: 0.843\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:19:12\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_19:19:12 -- Epoch: 3/20; Valid; loss: 0.576; acc: 0.741; precision: 0.743, recall: 0.736, macrof1: 0.741, weightedf1: 0.741\u001b[0m\n",
      "\u001b[92m2020-09-10 19:19:12\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=32.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:19:14\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_19:19:14 -- Epoch: 4/20; Train; loss: 0.284; acc: 0.903; precision: 0.901, recall: 0.906, macrof1: 0.903, weightedf1: 0.903\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:19:22\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_19:19:22 -- Epoch: 4/20; Valid; loss: 0.539; acc: 0.770; precision: 0.764, recall: 0.781, macrof1: 0.770, weightedf1: 0.770\u001b[0m\n",
      "\u001b[92m2020-09-10 19:19:22\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=32.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:19:24\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_19:19:24 -- Epoch: 5/20; Train; loss: 0.216; acc: 0.935; precision: 0.932, recall: 0.938, macrof1: 0.935, weightedf1: 0.935\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:19:33\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_19:19:33 -- Epoch: 5/20; Valid; loss: 0.514; acc: 0.787; precision: 0.780, recall: 0.799, macrof1: 0.787, weightedf1: 0.787\u001b[0m\n",
      "\u001b[92m2020-09-10 19:19:33\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=32.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:19:34\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_19:19:34 -- Epoch: 6/20; Train; loss: 0.156; acc: 0.964; precision: 0.962, recall: 0.966, macrof1: 0.964, weightedf1: 0.964\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:19:43\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_19:19:43 -- Epoch: 6/20; Valid; loss: 0.501; acc: 0.800; precision: 0.801, recall: 0.797, macrof1: 0.800, weightedf1: 0.800\u001b[0m\n",
      "\u001b[92m2020-09-10 19:19:43\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=32.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:19:44\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_19:19:44 -- Epoch: 7/20; Train; loss: 0.111; acc: 0.977; precision: 0.977, recall: 0.976, macrof1: 0.976, weightedf1: 0.976\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:19:53\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_19:19:53 -- Epoch: 7/20; Valid; loss: 0.496; acc: 0.810; precision: 0.807, recall: 0.815, macrof1: 0.810, weightedf1: 0.810\u001b[0m\n",
      "\u001b[92m2020-09-10 19:19:53\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=32.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:19:55\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_19:19:55 -- Epoch: 8/20; Train; loss: 0.081; acc: 0.989; precision: 0.986, recall: 0.992, macrof1: 0.989, weightedf1: 0.989\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:20:03\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_19:20:03 -- Epoch: 8/20; Valid; loss: 0.497; acc: 0.815; precision: 0.815, recall: 0.816, macrof1: 0.815, weightedf1: 0.815\u001b[0m\n",
      "\u001b[92m2020-09-10 19:20:03\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model (early stopped) with least valid loss (checkpoint: 7) at ./models/wikigaz_en_ft_ocr_lstm_v002_n2000/wikigaz_en_ft_ocr_lstm_v002_n2000.model\u001b[0m\n",
      "\u001b[92m2020-09-10 19:20:03\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n",
      "\u001b[92m2020-09-10 19:20:03\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mEarly stopping at epoch: 8, selected epoch: 7\u001b[0m\n",
      "\n",
      "\n",
      "\n",
      "\n",
      "====================\n",
      "User time: 82.0991\n",
      "====================\n"
     ]
    }
   ],
   "source": [
    "from DeezyMatch import finetune as dm_finetune\n",
    "\n",
    "# number of OCR training instances used for fine-tuning\n",
    "n_ft_examples = 2000\n",
    "\n",
    "# Fine-tune the pretrained model (pretrained_model_path, pretrained_vocab_path)\n",
    "# on n_ft_examples instances of the OCR dataset, using the model B input file.\n",
    "dm_finetune(input_file_path=\"./inputs/input_dfm_lstm_model_B.yaml\",\n",
    "            dataset_path=\"./dataset/ocr_trainval.txt\",\n",
    "            model_name=f\"wikigaz_en_ft_ocr_lstm_v002_n{n_ft_examples}\",\n",
    "            pretrained_model_path=\"./models/wikigaz_en_lstm_001/wikigaz_en_lstm_001.model\",\n",
    "            pretrained_vocab_path=\"./models/wikigaz_en_lstm_001/wikigaz_en_lstm_001.vocab\",\n",
    "            n_train_examples=n_ft_examples\n",
    "           )"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 59,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:20:03\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mread input file: ./inputs/input_dfm_lstm_model_B.yaml\u001b[0m\n",
      "\u001b[92m2020-09-10 19:20:03\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32mpytorch will use: cuda:1\u001b[0m\n",
      "\u001b[92m2020-09-10 19:20:03\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mread CSV file: ./dataset/ocr_trainval.txt\u001b[0m\n",
      "\u001b[92m2020-09-10 19:20:04\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32mnumber of labels, True: 42301 and False: 42302\u001b[0m\n",
      "\u001b[92m2020-09-10 19:20:04\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mSplitting the Dataset\u001b[0m\n",
      "\u001b[92m2020-09-10 19:20:04\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mfinish splitting the Dataset. User time: 0.04894113540649414\u001b[0m\n",
      "\u001b[92m2020-09-10 19:20:04\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msplits are as follow:\n",
      "not_assigned    55221\n",
      "val             25380\n",
      "train            4000\n",
      "test                2\n",
      "Name: split, dtype: int64\u001b[0m\n",
      "\u001b[92m2020-09-10 19:20:04\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mstart creating a lookup table and convert characters to indices\u001b[0m\n",
      "\u001b[92m2020-09-10 19:20:04\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32m-- create vocabulary\u001b[0m\n",
      "\u001b[92m2020-09-10 19:20:05\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32m-- convert tokens to indices\u001b[0m\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "length s2:   0%|          | 0/25380 [00:00<?, ?it/s]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:20:06\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mskipping 0 lines\u001b[0m\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "                                                                     \r"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "\n",
      "============================================================\n",
      "List all parameters in the model\n",
      "============================================================\n",
      "emb.weight False\n",
      "rnn_1.weight_ih_l0 True\n",
      "rnn_1.weight_hh_l0 True\n",
      "rnn_1.bias_ih_l0 True\n",
      "rnn_1.bias_hh_l0 True\n",
      "rnn_1.weight_ih_l0_reverse True\n",
      "rnn_1.weight_hh_l0_reverse True\n",
      "rnn_1.bias_ih_l0_reverse True\n",
      "rnn_1.bias_hh_l0_reverse True\n",
      "rnn_1.weight_ih_l1 True\n",
      "rnn_1.weight_hh_l1 True\n",
      "rnn_1.bias_ih_l1 True\n",
      "rnn_1.bias_hh_l1 True\n",
      "rnn_1.weight_ih_l1_reverse True\n",
      "rnn_1.weight_hh_l1_reverse True\n",
      "rnn_1.bias_ih_l1_reverse True\n",
      "rnn_1.bias_hh_l1_reverse True\n",
      "attn_step1.weight True\n",
      "attn_step1.bias True\n",
      "attn_step2.weight True\n",
      "attn_step2.bias True\n",
      "fc1.weight True\n",
      "fc1.bias True\n",
      "fc2.weight True\n",
      "fc2.bias True\n",
      "============================================================\n",
      "\n",
      "\n",
      "\n",
      "\u001b[92m2020-09-10 19:20:06\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m******************************\u001b[0m\n",
      "\u001b[92m2020-09-10 19:20:06\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m**** (Bi-directional) LSTM ****\u001b[0m\n",
      "\u001b[92m2020-09-10 19:20:06\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m******************************\u001b[0m\n",
      "\u001b[92m2020-09-10 19:20:06\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mNumber of batches: 63\u001b[0m\n",
      "\u001b[92m2020-09-10 19:20:06\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mNumber of epochs: 20\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "c07113badf074a3dbe9d7e9e057648b1",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=20.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=63.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "\n",
      "\n",
      "====================\n",
      "Total number of params: 721323\n",
      "\n",
      "two_parallel_rnns (\n",
      "  (emb): Embedding(7542, 60), weights=((7542, 60),), parameters=452520\n",
      "  (rnn_1): LSTM(60, 60, num_layers=2, dropout=0.01, bidirectional=True), weights=((240, 60), (240, 60), (240,), (240,), (240, 60), (240, 60), (240,), (240,), (240, 120), (240, 60), (240,), (240,), (240, 120), (240, 60), (240,), (240,)), parameters=145920\n",
      "  (attn_step1): Linear(in_features=120, out_features=60, bias=True), weights=((60, 120), (60,)), parameters=7260\n",
      "  (attn_step2): Linear(in_features=60, out_features=1, bias=True), weights=((1, 60), (1,)), parameters=61\n",
      "  (fc1): Linear(in_features=960, out_features=120, bias=True), weights=((120, 960), (120,)), parameters=115320\n",
      "  (fc2): Linear(in_features=120, out_features=2, bias=True), weights=((2, 120), (2,)), parameters=242\n",
      ")\n",
      "====================\n",
      "\n",
      "\n",
      "\u001b[92m2020-09-10 19:20:09\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_19:20:09 -- Epoch: 1/20; Train; loss: 0.899; acc: 0.624; precision: 0.617, recall: 0.657, macrof1: 0.624, weightedf1: 0.624\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:20:18\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_19:20:18 -- Epoch: 1/20; Valid; loss: 0.619; acc: 0.701; precision: 0.707, recall: 0.686, macrof1: 0.700, weightedf1: 0.700\u001b[0m\n",
      "\u001b[92m2020-09-10 19:20:18\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=63.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:20:21\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_19:20:21 -- Epoch: 2/20; Train; loss: 0.429; acc: 0.812; precision: 0.805, recall: 0.824, macrof1: 0.812, weightedf1: 0.812\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:20:30\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_19:20:30 -- Epoch: 2/20; Valid; loss: 0.494; acc: 0.778; precision: 0.765, recall: 0.802, macrof1: 0.778, weightedf1: 0.778\u001b[0m\n",
      "\u001b[92m2020-09-10 19:20:30\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=63.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:20:33\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_19:20:33 -- Epoch: 3/20; Train; loss: 0.294; acc: 0.885; precision: 0.881, recall: 0.892, macrof1: 0.885, weightedf1: 0.885\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:20:42\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_19:20:42 -- Epoch: 3/20; Valid; loss: 0.430; acc: 0.816; precision: 0.817, recall: 0.815, macrof1: 0.816, weightedf1: 0.816\u001b[0m\n",
      "\u001b[92m2020-09-10 19:20:42\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=63.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:20:45\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_19:20:45 -- Epoch: 4/20; Train; loss: 0.202; acc: 0.932; precision: 0.929, recall: 0.935, macrof1: 0.932, weightedf1: 0.932\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:20:53\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_19:20:53 -- Epoch: 4/20; Valid; loss: 0.398; acc: 0.837; precision: 0.833, recall: 0.844, macrof1: 0.837, weightedf1: 0.837\u001b[0m\n",
      "\u001b[92m2020-09-10 19:20:53\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=63.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:20:57\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_19:20:57 -- Epoch: 5/20; Train; loss: 0.139; acc: 0.964; precision: 0.960, recall: 0.968, macrof1: 0.963, weightedf1: 0.963\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:21:05\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_19:21:05 -- Epoch: 5/20; Valid; loss: 0.385; acc: 0.848; precision: 0.852, recall: 0.844, macrof1: 0.848, weightedf1: 0.848\u001b[0m\n",
      "\u001b[92m2020-09-10 19:21:05\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=63.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:21:08\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_19:21:08 -- Epoch: 6/20; Train; loss: 0.092; acc: 0.981; precision: 0.978, recall: 0.984, macrof1: 0.981, weightedf1: 0.981\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:21:17\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_19:21:17 -- Epoch: 6/20; Valid; loss: 0.384; acc: 0.854; precision: 0.861, recall: 0.844, macrof1: 0.854, weightedf1: 0.854\u001b[0m\n",
      "\u001b[92m2020-09-10 19:21:17\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=63.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:21:20\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_19:21:20 -- Epoch: 7/20; Train; loss: 0.059; acc: 0.993; precision: 0.992, recall: 0.993, macrof1: 0.992, weightedf1: 0.992\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:21:29\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_19:21:29 -- Epoch: 7/20; Valid; loss: 0.392; acc: 0.858; precision: 0.868, recall: 0.845, macrof1: 0.858, weightedf1: 0.858\u001b[0m\n",
      "\u001b[92m2020-09-10 19:21:29\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model (early stopped) with least valid loss (checkpoint: 6) at ./models/wikigaz_en_ft_ocr_lstm_v002_n4000/wikigaz_en_ft_ocr_lstm_v002_n4000.model\u001b[0m\n",
      "\u001b[92m2020-09-10 19:21:29\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n",
      "\u001b[92m2020-09-10 19:21:29\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mEarly stopping at epoch: 7, selected epoch: 6\u001b[0m\n",
      "\n",
      "\n",
      "\n",
      "\n",
      "====================\n",
      "User time: 82.6551\n",
      "====================\n"
     ]
    }
   ],
   "source": [
    "from DeezyMatch import finetune as dm_finetune\n",
    "\n",
    "n_ft_examples = 4000\n",
    "\n",
    "# fine-tune a pretrained model stored at pretrained_model_path and pretrained_vocab_path \n",
    "dm_finetune(input_file_path=\"./inputs/input_dfm_lstm_model_B.yaml\", \n",
    "            dataset_path=\"./dataset/ocr_trainval.txt\", \n",
    "            model_name=f\"wikigaz_en_ft_ocr_lstm_v002_n{n_ft_examples}\",\n",
    "            pretrained_model_path=\"./models/wikigaz_en_lstm_001/wikigaz_en_lstm_001.model\", \n",
    "            pretrained_vocab_path=\"./models/wikigaz_en_lstm_001/wikigaz_en_lstm_001.vocab\",\n",
    "            n_train_examples=n_ft_examples\n",
    "           )"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 60,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:21:29\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mread input file: ./inputs/input_dfm_lstm_model_B.yaml\u001b[0m\n",
      "\u001b[92m2020-09-10 19:21:29\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32mpytorch will use: cuda:1\u001b[0m\n",
      "\u001b[92m2020-09-10 19:21:29\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mread CSV file: ./dataset/ocr_trainval.txt\u001b[0m\n",
      "\u001b[92m2020-09-10 19:21:29\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32mnumber of labels, True: 42301 and False: 42302\u001b[0m\n",
      "\u001b[92m2020-09-10 19:21:29\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mSplitting the Dataset\u001b[0m\n",
      "\u001b[92m2020-09-10 19:21:29\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mfinish splitting the Dataset. User time: 0.04610013961791992\u001b[0m\n",
      "\u001b[92m2020-09-10 19:21:29\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msplits are as follow:\n",
      "not_assigned    51221\n",
      "val             25380\n",
      "train            8000\n",
      "test                2\n",
      "Name: split, dtype: int64\u001b[0m\n",
      "\u001b[92m2020-09-10 19:21:29\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mstart creating a lookup table and convert characters to indices\u001b[0m\n",
      "\u001b[92m2020-09-10 19:21:30\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32m-- create vocabulary\u001b[0m\n",
      "\u001b[92m2020-09-10 19:21:31\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32m-- convert tokens to indices\u001b[0m\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "length s1:   0%|          | 0/25380 [00:00<?, ?it/s]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:21:31\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mskipping 0 lines\u001b[0m\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "                                                                     \r"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "\n",
      "============================================================\n",
      "List all parameters in the model\n",
      "============================================================\n",
      "emb.weight False\n",
      "rnn_1.weight_ih_l0 True\n",
      "rnn_1.weight_hh_l0 True\n",
      "rnn_1.bias_ih_l0 True\n",
      "rnn_1.bias_hh_l0 True\n",
      "rnn_1.weight_ih_l0_reverse True\n",
      "rnn_1.weight_hh_l0_reverse True\n",
      "rnn_1.bias_ih_l0_reverse True\n",
      "rnn_1.bias_hh_l0_reverse True\n",
      "rnn_1.weight_ih_l1 True\n",
      "rnn_1.weight_hh_l1 True\n",
      "rnn_1.bias_ih_l1 True\n",
      "rnn_1.bias_hh_l1 True\n",
      "rnn_1.weight_ih_l1_reverse True\n",
      "rnn_1.weight_hh_l1_reverse True\n",
      "rnn_1.bias_ih_l1_reverse True\n",
      "rnn_1.bias_hh_l1_reverse True\n",
      "attn_step1.weight True\n",
      "attn_step1.bias True\n",
      "attn_step2.weight True\n",
      "attn_step2.bias True\n",
      "fc1.weight True\n",
      "fc1.bias True\n",
      "fc2.weight True\n",
      "fc2.bias True\n",
      "============================================================\n",
      "\n",
      "\n",
      "\n",
      "\u001b[92m2020-09-10 19:21:32\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m******************************\u001b[0m\n",
      "\u001b[92m2020-09-10 19:21:32\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m**** (Bi-directional) LSTM ****\u001b[0m\n",
      "\u001b[92m2020-09-10 19:21:32\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m******************************\u001b[0m\n",
      "\u001b[92m2020-09-10 19:21:32\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mNumber of batches: 125\u001b[0m\n",
      "\u001b[92m2020-09-10 19:21:32\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mNumber of epochs: 20\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "a111e7678f6041bd98fbbd00d12d16ef",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=20.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=125.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "\n",
      "\n",
      "====================\n",
      "Total number of params: 721323\n",
      "\n",
      "two_parallel_rnns (\n",
      "  (emb): Embedding(7542, 60), weights=((7542, 60),), parameters=452520\n",
      "  (rnn_1): LSTM(60, 60, num_layers=2, dropout=0.01, bidirectional=True), weights=((240, 60), (240, 60), (240,), (240,), (240, 60), (240, 60), (240,), (240,), (240, 120), (240, 60), (240,), (240,), (240, 120), (240, 60), (240,), (240,)), parameters=145920\n",
      "  (attn_step1): Linear(in_features=120, out_features=60, bias=True), weights=((60, 120), (60,)), parameters=7260\n",
      "  (attn_step2): Linear(in_features=60, out_features=1, bias=True), weights=((1, 60), (1,)), parameters=61\n",
      "  (fc1): Linear(in_features=960, out_features=120, bias=True), weights=((120, 960), (120,)), parameters=115320\n",
      "  (fc2): Linear(in_features=120, out_features=2, bias=True), weights=((2, 120), (2,)), parameters=242\n",
      ")\n",
      "====================\n",
      "\n",
      "\n",
      "\u001b[92m2020-09-10 19:21:38\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_19:21:38 -- Epoch: 1/20; Train; loss: 0.715; acc: 0.688; precision: 0.678, recall: 0.715, macrof1: 0.688, weightedf1: 0.688\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:21:47\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_19:21:47 -- Epoch: 1/20; Valid; loss: 0.453; acc: 0.801; precision: 0.818, recall: 0.775, macrof1: 0.801, weightedf1: 0.801\u001b[0m\n",
      "\u001b[92m2020-09-10 19:21:47\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=125.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:21:53\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_19:21:53 -- Epoch: 2/20; Train; loss: 0.328; acc: 0.867; precision: 0.861, recall: 0.876, macrof1: 0.867, weightedf1: 0.867\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:22:02\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_19:22:02 -- Epoch: 2/20; Valid; loss: 0.334; acc: 0.863; precision: 0.860, recall: 0.868, macrof1: 0.863, weightedf1: 0.863\u001b[0m\n",
      "\u001b[92m2020-09-10 19:22:02\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=125.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:22:08\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_19:22:08 -- Epoch: 3/20; Train; loss: 0.211; acc: 0.925; precision: 0.915, recall: 0.936, macrof1: 0.925, weightedf1: 0.925\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:22:17\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_19:22:17 -- Epoch: 3/20; Valid; loss: 0.286; acc: 0.889; precision: 0.884, recall: 0.895, macrof1: 0.889, weightedf1: 0.889\u001b[0m\n",
      "\u001b[92m2020-09-10 19:22:17\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=125.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:22:24\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_19:22:24 -- Epoch: 4/20; Train; loss: 0.136; acc: 0.959; precision: 0.952, recall: 0.967, macrof1: 0.959, weightedf1: 0.959\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:22:32\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_19:22:32 -- Epoch: 4/20; Valid; loss: 0.270; acc: 0.898; precision: 0.892, recall: 0.905, macrof1: 0.898, weightedf1: 0.898\u001b[0m\n",
      "\u001b[92m2020-09-10 19:22:32\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=125.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:22:39\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_19:22:39 -- Epoch: 5/20; Train; loss: 0.084; acc: 0.979; precision: 0.975, recall: 0.983, macrof1: 0.979, weightedf1: 0.979\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:22:47\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_19:22:47 -- Epoch: 5/20; Valid; loss: 0.267; acc: 0.902; precision: 0.887, recall: 0.922, macrof1: 0.902, weightedf1: 0.902\u001b[0m\n",
      "\u001b[92m2020-09-10 19:22:47\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=125.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:22:54\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_19:22:54 -- Epoch: 6/20; Train; loss: 0.050; acc: 0.991; precision: 0.989, recall: 0.993, macrof1: 0.991, weightedf1: 0.991\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:23:03\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_19:23:03 -- Epoch: 6/20; Valid; loss: 0.274; acc: 0.905; precision: 0.904, recall: 0.906, macrof1: 0.905, weightedf1: 0.905\u001b[0m\n",
      "\u001b[92m2020-09-10 19:23:03\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model (early stopped) with least valid loss (checkpoint: 5) at ./models/wikigaz_en_ft_ocr_lstm_v002_n8000/wikigaz_en_ft_ocr_lstm_v002_n8000.model\u001b[0m\n",
      "\u001b[92m2020-09-10 19:23:03\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n",
      "\u001b[92m2020-09-10 19:23:03\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mEarly stopping at epoch: 6, selected epoch: 5\u001b[0m\n",
      "\n",
      "\n",
      "\n",
      "\n",
      "====================\n",
      "User time: 91.0168\n",
      "====================\n"
     ]
    }
   ],
   "source": [
    "from DeezyMatch import finetune as dm_finetune\n",
    "\n",
    "n_ft_examples = 8000\n",
    "\n",
    "# fine-tune a pretrained model stored at pretrained_model_path and pretrained_vocab_path \n",
    "dm_finetune(input_file_path=\"./inputs/input_dfm_lstm_model_B.yaml\", \n",
    "            dataset_path=\"./dataset/ocr_trainval.txt\", \n",
    "            model_name=f\"wikigaz_en_ft_ocr_lstm_v002_n{n_ft_examples}\",\n",
    "            pretrained_model_path=\"./models/wikigaz_en_lstm_001/wikigaz_en_lstm_001.model\", \n",
    "            pretrained_vocab_path=\"./models/wikigaz_en_lstm_001/wikigaz_en_lstm_001.vocab\",\n",
    "            n_train_examples=n_ft_examples\n",
    "           )"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 61,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:23:03\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mread input file: ./inputs/input_dfm_lstm_model_B.yaml\u001b[0m\n",
      "\u001b[92m2020-09-10 19:23:03\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32mpytorch will use: cuda:1\u001b[0m\n",
      "\u001b[92m2020-09-10 19:23:03\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mread CSV file: ./dataset/ocr_trainval.txt\u001b[0m\n",
      "\u001b[92m2020-09-10 19:23:03\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32mnumber of labels, True: 42301 and False: 42302\u001b[0m\n",
      "\u001b[92m2020-09-10 19:23:03\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mSplitting the Dataset\u001b[0m\n",
      "\u001b[92m2020-09-10 19:23:03\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mfinish splitting the Dataset. User time: 0.05091238021850586\u001b[0m\n",
      "\u001b[92m2020-09-10 19:23:03\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msplits are as follow:\n",
      "not_assigned    43221\n",
      "val             25380\n",
      "train           16000\n",
      "test                2\n",
      "Name: split, dtype: int64\u001b[0m\n",
      "\u001b[92m2020-09-10 19:23:03\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mstart creating a lookup table and convert characters to indices\u001b[0m\n",
      "\u001b[92m2020-09-10 19:23:04\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32m-- create vocabulary\u001b[0m\n",
      "\u001b[92m2020-09-10 19:23:05\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32m-- convert tokens to indices\u001b[0m\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "s1 padding:   0%|          | 0/16000 [00:00<?, ?it/s]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:23:05\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mskipping 0 lines\u001b[0m\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "                                                                     \r"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "\n",
      "============================================================\n",
      "List all parameters in the model\n",
      "============================================================\n",
      "emb.weight False\n",
      "rnn_1.weight_ih_l0 True\n",
      "rnn_1.weight_hh_l0 True\n",
      "rnn_1.bias_ih_l0 True\n",
      "rnn_1.bias_hh_l0 True\n",
      "rnn_1.weight_ih_l0_reverse True\n",
      "rnn_1.weight_hh_l0_reverse True\n",
      "rnn_1.bias_ih_l0_reverse True\n",
      "rnn_1.bias_hh_l0_reverse True\n",
      "rnn_1.weight_ih_l1 True\n",
      "rnn_1.weight_hh_l1 True\n",
      "rnn_1.bias_ih_l1 True\n",
      "rnn_1.bias_hh_l1 True\n",
      "rnn_1.weight_ih_l1_reverse True\n",
      "rnn_1.weight_hh_l1_reverse True\n",
      "rnn_1.bias_ih_l1_reverse True\n",
      "rnn_1.bias_hh_l1_reverse True\n",
      "attn_step1.weight True\n",
      "attn_step1.bias True\n",
      "attn_step2.weight True\n",
      "attn_step2.bias True\n",
      "fc1.weight True\n",
      "fc1.bias True\n",
      "fc2.weight True\n",
      "fc2.bias True\n",
      "============================================================\n",
      "\n",
      "\n",
      "\n",
      "\u001b[92m2020-09-10 19:23:06\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m******************************\u001b[0m\n",
      "\u001b[92m2020-09-10 19:23:06\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m**** (Bi-directional) LSTM ****\u001b[0m\n",
      "\u001b[92m2020-09-10 19:23:06\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m******************************\u001b[0m\n",
      "\u001b[92m2020-09-10 19:23:06\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mNumber of batches: 250\u001b[0m\n",
      "\u001b[92m2020-09-10 19:23:06\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mNumber of epochs: 20\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "7f6fd044e1d4485c80e136ce9b38d2dc",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=20.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=250.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "\n",
      "\n",
      "====================\n",
      "Total number of params: 721323\n",
      "\n",
      "two_parallel_rnns (\n",
      "  (emb): Embedding(7542, 60), weights=((7542, 60),), parameters=452520\n",
      "  (rnn_1): LSTM(60, 60, num_layers=2, dropout=0.01, bidirectional=True), weights=((240, 60), (240, 60), (240,), (240,), (240, 60), (240, 60), (240,), (240,), (240, 120), (240, 60), (240,), (240,), (240, 120), (240, 60), (240,), (240,)), parameters=145920\n",
      "  (attn_step1): Linear(in_features=120, out_features=60, bias=True), weights=((60, 120), (60,)), parameters=7260\n",
      "  (attn_step2): Linear(in_features=60, out_features=1, bias=True), weights=((1, 60), (1,)), parameters=61\n",
      "  (fc1): Linear(in_features=960, out_features=120, bias=True), weights=((120, 960), (120,)), parameters=115320\n",
      "  (fc2): Linear(in_features=120, out_features=2, bias=True), weights=((2, 120), (2,)), parameters=242\n",
      ")\n",
      "====================\n",
      "\n",
      "\n",
      "\u001b[92m2020-09-10 19:23:19\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_19:23:19 -- Epoch: 1/20; Train; loss: 0.557; acc: 0.760; precision: 0.750, recall: 0.780, macrof1: 0.760, weightedf1: 0.760\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:23:28\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_19:23:28 -- Epoch: 1/20; Valid; loss: 0.324; acc: 0.866; precision: 0.858, recall: 0.877, macrof1: 0.866, weightedf1: 0.866\u001b[0m\n",
      "\u001b[92m2020-09-10 19:23:28\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=250.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:23:41\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_19:23:41 -- Epoch: 2/20; Train; loss: 0.238; acc: 0.910; precision: 0.901, recall: 0.922, macrof1: 0.910, weightedf1: 0.910\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:23:49\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_19:23:49 -- Epoch: 2/20; Valid; loss: 0.242; acc: 0.905; precision: 0.892, recall: 0.922, macrof1: 0.905, weightedf1: 0.905\u001b[0m\n",
      "\u001b[92m2020-09-10 19:23:49\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=250.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:24:02\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_19:24:02 -- Epoch: 3/20; Train; loss: 0.146; acc: 0.951; precision: 0.942, recall: 0.961, macrof1: 0.951, weightedf1: 0.951\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:24:11\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_19:24:11 -- Epoch: 3/20; Valid; loss: 0.217; acc: 0.919; precision: 0.908, recall: 0.932, macrof1: 0.919, weightedf1: 0.919\u001b[0m\n",
      "\u001b[92m2020-09-10 19:24:11\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=250.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:24:24\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_19:24:24 -- Epoch: 4/20; Train; loss: 0.089; acc: 0.974; precision: 0.968, recall: 0.980, macrof1: 0.974, weightedf1: 0.974\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:24:32\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_19:24:32 -- Epoch: 4/20; Valid; loss: 0.216; acc: 0.923; precision: 0.921, recall: 0.926, macrof1: 0.923, weightedf1: 0.923\u001b[0m\n",
      "\u001b[92m2020-09-10 19:24:32\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=250.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:24:45\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_19:24:45 -- Epoch: 5/20; Train; loss: 0.050; acc: 0.988; precision: 0.985, recall: 0.991, macrof1: 0.988, weightedf1: 0.988\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:24:54\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_19:24:54 -- Epoch: 5/20; Valid; loss: 0.221; acc: 0.926; precision: 0.921, recall: 0.931, macrof1: 0.926, weightedf1: 0.926\u001b[0m\n",
      "\u001b[92m2020-09-10 19:24:54\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model (early stopped) with least valid loss (checkpoint: 4) at ./models/wikigaz_en_ft_ocr_lstm_v002_n16000/wikigaz_en_ft_ocr_lstm_v002_n16000.model\u001b[0m\n",
      "\u001b[92m2020-09-10 19:24:54\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n",
      "\u001b[92m2020-09-10 19:24:54\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mEarly stopping at epoch: 5, selected epoch: 4\u001b[0m\n",
      "\n",
      "\n",
      "\n",
      "\n",
      "====================\n",
      "User time: 108.1344\n",
      "====================\n"
     ]
    }
   ],
   "source": [
    "from DeezyMatch import finetune as dm_finetune\n",
    "\n",
    "n_ft_examples = 16000\n",
    "\n",
    "# Fine-tune the pretrained WG:en model on n_ft_examples training examples from the OCR dataset.\n",
    "# Model B config: only the embedding layer is frozen. The pretrained weights and vocabulary\n",
    "# are read from pretrained_model_path and pretrained_vocab_path.\n",
    "dm_finetune(input_file_path=\"./inputs/input_dfm_lstm_model_B.yaml\",\n",
    "            dataset_path=\"./dataset/ocr_trainval.txt\",\n",
    "            model_name=f\"wikigaz_en_ft_ocr_lstm_v002_n{n_ft_examples}\",\n",
    "            pretrained_model_path=\"./models/wikigaz_en_lstm_001/wikigaz_en_lstm_001.model\",\n",
    "            pretrained_vocab_path=\"./models/wikigaz_en_lstm_001/wikigaz_en_lstm_001.vocab\",\n",
    "            n_train_examples=n_ft_examples\n",
    "           )"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 62,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:24:54\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mread input file: ./inputs/input_dfm_lstm_model_B.yaml\u001b[0m\n",
      "\u001b[92m2020-09-10 19:24:54\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32mpytorch will use: cuda:1\u001b[0m\n",
      "\u001b[92m2020-09-10 19:24:54\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mread CSV file: ./dataset/ocr_trainval.txt\u001b[0m\n",
      "\u001b[92m2020-09-10 19:24:55\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32mnumber of labels, True: 42301 and False: 42302\u001b[0m\n",
      "\u001b[92m2020-09-10 19:24:55\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mSplitting the Dataset\u001b[0m\n",
      "\u001b[92m2020-09-10 19:24:55\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mfinish splitting the Dataset. User time: 0.05561685562133789\u001b[0m\n",
      "\u001b[92m2020-09-10 19:24:55\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msplits are as follow:\n",
      "train           32000\n",
      "not_assigned    27221\n",
      "val             25380\n",
      "test                2\n",
      "Name: split, dtype: int64\u001b[0m\n",
      "\u001b[92m2020-09-10 19:24:55\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mstart creating a lookup table and convert characters to indices\u001b[0m\n",
      "\u001b[92m2020-09-10 19:24:55\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32m-- create vocabulary\u001b[0m\n",
      "\u001b[92m2020-09-10 19:24:56\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32m-- convert tokens to indices\u001b[0m\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "s1 padding:   0%|          | 0/32000 [00:00<?, ?it/s]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:24:56\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mskipping 0 lines\u001b[0m\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "                                                                     \r"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "\n",
      "============================================================\n",
      "List all parameters in the model\n",
      "============================================================\n",
      "emb.weight False\n",
      "rnn_1.weight_ih_l0 True\n",
      "rnn_1.weight_hh_l0 True\n",
      "rnn_1.bias_ih_l0 True\n",
      "rnn_1.bias_hh_l0 True\n",
      "rnn_1.weight_ih_l0_reverse True\n",
      "rnn_1.weight_hh_l0_reverse True\n",
      "rnn_1.bias_ih_l0_reverse True\n",
      "rnn_1.bias_hh_l0_reverse True\n",
      "rnn_1.weight_ih_l1 True\n",
      "rnn_1.weight_hh_l1 True\n",
      "rnn_1.bias_ih_l1 True\n",
      "rnn_1.bias_hh_l1 True\n",
      "rnn_1.weight_ih_l1_reverse True\n",
      "rnn_1.weight_hh_l1_reverse True\n",
      "rnn_1.bias_ih_l1_reverse True\n",
      "rnn_1.bias_hh_l1_reverse True\n",
      "attn_step1.weight True\n",
      "attn_step1.bias True\n",
      "attn_step2.weight True\n",
      "attn_step2.bias True\n",
      "fc1.weight True\n",
      "fc1.bias True\n",
      "fc2.weight True\n",
      "fc2.bias True\n",
      "============================================================\n",
      "\n",
      "\n",
      "\n",
      "\u001b[92m2020-09-10 19:24:57\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m******************************\u001b[0m\n",
      "\u001b[92m2020-09-10 19:24:57\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m**** (Bi-directional) LSTM ****\u001b[0m\n",
      "\u001b[92m2020-09-10 19:24:57\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m******************************\u001b[0m\n",
      "\u001b[92m2020-09-10 19:24:57\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mNumber of batches: 500\u001b[0m\n",
      "\u001b[92m2020-09-10 19:24:57\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mNumber of epochs: 20\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "2a01d58f63e74ddd895caffbd21104a1",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=20.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=500.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "\n",
      "\n",
      "====================\n",
      "Total number of params: 721323\n",
      "\n",
      "two_parallel_rnns (\n",
      "  (emb): Embedding(7542, 60), weights=((7542, 60),), parameters=452520\n",
      "  (rnn_1): LSTM(60, 60, num_layers=2, dropout=0.01, bidirectional=True), weights=((240, 60), (240, 60), (240,), (240,), (240, 60), (240, 60), (240,), (240,), (240, 120), (240, 60), (240,), (240,), (240, 120), (240, 60), (240,), (240,)), parameters=145920\n",
      "  (attn_step1): Linear(in_features=120, out_features=60, bias=True), weights=((60, 120), (60,)), parameters=7260\n",
      "  (attn_step2): Linear(in_features=60, out_features=1, bias=True), weights=((1, 60), (1,)), parameters=61\n",
      "  (fc1): Linear(in_features=960, out_features=120, bias=True), weights=((120, 960), (120,)), parameters=115320\n",
      "  (fc2): Linear(in_features=120, out_features=2, bias=True), weights=((2, 120), (2,)), parameters=242\n",
      ")\n",
      "====================\n",
      "\n",
      "\n",
      "\u001b[92m2020-09-10 19:25:24\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_19:25:24 -- Epoch: 1/20; Train; loss: 0.414; acc: 0.826; precision: 0.814, recall: 0.844, macrof1: 0.826, weightedf1: 0.826\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:25:33\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_19:25:33 -- Epoch: 1/20; Valid; loss: 0.241; acc: 0.909; precision: 0.902, recall: 0.918, macrof1: 0.909, weightedf1: 0.909\u001b[0m\n",
      "\u001b[92m2020-09-10 19:25:33\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=500.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:25:59\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_19:25:59 -- Epoch: 2/20; Train; loss: 0.171; acc: 0.938; precision: 0.928, recall: 0.948, macrof1: 0.938, weightedf1: 0.938\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:26:08\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_19:26:08 -- Epoch: 2/20; Valid; loss: 0.187; acc: 0.931; precision: 0.915, recall: 0.951, macrof1: 0.931, weightedf1: 0.931\u001b[0m\n",
      "\u001b[92m2020-09-10 19:26:08\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=500.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:26:34\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_19:26:34 -- Epoch: 3/20; Train; loss: 0.102; acc: 0.967; precision: 0.958, recall: 0.976, macrof1: 0.967, weightedf1: 0.967\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:26:43\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_19:26:43 -- Epoch: 3/20; Valid; loss: 0.179; acc: 0.938; precision: 0.928, recall: 0.950, macrof1: 0.938, weightedf1: 0.938\u001b[0m\n",
      "\u001b[92m2020-09-10 19:26:43\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=500.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:27:09\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_19:27:09 -- Epoch: 4/20; Train; loss: 0.062; acc: 0.981; precision: 0.976, recall: 0.986, macrof1: 0.981, weightedf1: 0.981\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:27:18\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_19:27:18 -- Epoch: 4/20; Valid; loss: 0.186; acc: 0.940; precision: 0.932, recall: 0.950, macrof1: 0.940, weightedf1: 0.940\u001b[0m\n",
      "\u001b[92m2020-09-10 19:27:18\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model (early stopped) with least valid loss (checkpoint: 3) at ./models/wikigaz_en_ft_ocr_lstm_v002_n32000/wikigaz_en_ft_ocr_lstm_v002_n32000.model\u001b[0m\n",
      "\u001b[92m2020-09-10 19:27:18\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n",
      "\u001b[92m2020-09-10 19:27:18\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mEarly stopping at epoch: 4, selected epoch: 3\u001b[0m\n",
      "\n",
      "\n",
      "\n",
      "\n",
      "====================\n",
      "User time: 140.5165\n",
      "====================\n"
     ]
    }
   ],
   "source": [
    "from DeezyMatch import finetune as dm_finetune\n",
    "\n",
    "n_ft_examples = 32000\n",
    "\n",
    "# Fine-tune the pretrained WG:en model on n_ft_examples training examples from the OCR dataset.\n",
    "# Model B config: only the embedding layer is frozen. The pretrained weights and vocabulary\n",
    "# are read from pretrained_model_path and pretrained_vocab_path.\n",
    "dm_finetune(input_file_path=\"./inputs/input_dfm_lstm_model_B.yaml\",\n",
    "            dataset_path=\"./dataset/ocr_trainval.txt\",\n",
    "            model_name=f\"wikigaz_en_ft_ocr_lstm_v002_n{n_ft_examples}\",\n",
    "            pretrained_model_path=\"./models/wikigaz_en_lstm_001/wikigaz_en_lstm_001.model\",\n",
    "            pretrained_vocab_path=\"./models/wikigaz_en_lstm_001/wikigaz_en_lstm_001.vocab\",\n",
    "            n_train_examples=n_ft_examples\n",
    "           )"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 63,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:27:18\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mread input file: ./inputs/input_dfm_lstm_model_B_no_early_stopping.yaml\u001b[0m\n",
      "\u001b[92m2020-09-10 19:27:18\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32mpytorch will use: cuda:1\u001b[0m\n",
      "\u001b[92m2020-09-10 19:27:18\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mread CSV file: ./dataset/ocr_trainval.txt\u001b[0m\n",
      "\u001b[92m2020-09-10 19:27:19\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32mnumber of labels, True: 42301 and False: 42302\u001b[0m\n",
      "\u001b[92m2020-09-10 19:27:19\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mSplitting the Dataset\u001b[0m\n",
      "\u001b[92m2020-09-10 19:27:19\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mfinish splitting the Dataset. User time: 0.05287814140319824\u001b[0m\n",
      "\u001b[92m2020-09-10 19:27:19\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msplits are as follow:\n",
      "train    64000\n",
      "val      20603\n",
      "Name: split, dtype: int64\u001b[0m\n",
      "\u001b[92m2020-09-10 19:27:19\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mstart creating a lookup table and convert characters to indices\u001b[0m\n",
      "\u001b[92m2020-09-10 19:27:19\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32m-- create vocabulary\u001b[0m\n",
      "\u001b[92m2020-09-10 19:27:20\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32m-- convert tokens to indices\u001b[0m\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "length s2:   0%|          | 0/64000 [00:00<?, ?it/s]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:27:21\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mskipping 0 lines\u001b[0m\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "                                                                     \r"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "\n",
      "============================================================\n",
      "List all parameters in the model\n",
      "============================================================\n",
      "emb.weight False\n",
      "rnn_1.weight_ih_l0 True\n",
      "rnn_1.weight_hh_l0 True\n",
      "rnn_1.bias_ih_l0 True\n",
      "rnn_1.bias_hh_l0 True\n",
      "rnn_1.weight_ih_l0_reverse True\n",
      "rnn_1.weight_hh_l0_reverse True\n",
      "rnn_1.bias_ih_l0_reverse True\n",
      "rnn_1.bias_hh_l0_reverse True\n",
      "rnn_1.weight_ih_l1 True\n",
      "rnn_1.weight_hh_l1 True\n",
      "rnn_1.bias_ih_l1 True\n",
      "rnn_1.bias_hh_l1 True\n",
      "rnn_1.weight_ih_l1_reverse True\n",
      "rnn_1.weight_hh_l1_reverse True\n",
      "rnn_1.bias_ih_l1_reverse True\n",
      "rnn_1.bias_hh_l1_reverse True\n",
      "attn_step1.weight True\n",
      "attn_step1.bias True\n",
      "attn_step2.weight True\n",
      "attn_step2.bias True\n",
      "fc1.weight True\n",
      "fc1.bias True\n",
      "fc2.weight True\n",
      "fc2.bias True\n",
      "============================================================\n",
      "\n",
      "\n",
      "\n",
      "\u001b[92m2020-09-10 19:27:22\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m******************************\u001b[0m\n",
      "\u001b[92m2020-09-10 19:27:22\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m**** (Bi-directional) LSTM ****\u001b[0m\n",
      "\u001b[92m2020-09-10 19:27:22\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m******************************\u001b[0m\n",
      "\u001b[92m2020-09-10 19:27:22\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mNumber of batches: 1000\u001b[0m\n",
      "\u001b[92m2020-09-10 19:27:22\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mNumber of epochs: 10\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "08eac835e3934dec97977d73fe77bad6",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=10.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=1000.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "\n",
      "\n",
      "====================\n",
      "Total number of params: 721323\n",
      "\n",
      "two_parallel_rnns (\n",
      "  (emb): Embedding(7542, 60), weights=((7542, 60),), parameters=452520\n",
      "  (rnn_1): LSTM(60, 60, num_layers=2, dropout=0.01, bidirectional=True), weights=((240, 60), (240, 60), (240,), (240,), (240, 60), (240, 60), (240,), (240,), (240, 120), (240, 60), (240,), (240,), (240, 120), (240, 60), (240,), (240,)), parameters=145920\n",
      "  (attn_step1): Linear(in_features=120, out_features=60, bias=True), weights=((60, 120), (60,)), parameters=7260\n",
      "  (attn_step2): Linear(in_features=60, out_features=1, bias=True), weights=((1, 60), (1,)), parameters=61\n",
      "  (fc1): Linear(in_features=960, out_features=120, bias=True), weights=((120, 960), (120,)), parameters=115320\n",
      "  (fc2): Linear(in_features=120, out_features=2, bias=True), weights=((2, 120), (2,)), parameters=242\n",
      ")\n",
      "====================\n",
      "\n",
      "\n",
      "\u001b[92m2020-09-10 19:28:14\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_19:28:14 -- Epoch: 1/10; Train; loss: 0.308; acc: 0.875; precision: 0.863, recall: 0.892, macrof1: 0.875, weightedf1: 0.875\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=322.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:28:21\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_19:28:21 -- Epoch: 1/10; Valid; loss: 0.167; acc: 0.936; precision: 0.915, recall: 0.962, macrof1: 0.936, weightedf1: 0.936\u001b[0m\n",
      "\u001b[92m2020-09-10 19:28:21\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=1000.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:29:12\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_19:29:12 -- Epoch: 2/10; Train; loss: 0.128; acc: 0.955; precision: 0.947, recall: 0.964, macrof1: 0.955, weightedf1: 0.955\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=322.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:29:19\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_19:29:19 -- Epoch: 2/10; Valid; loss: 0.138; acc: 0.949; precision: 0.932, recall: 0.969, macrof1: 0.949, weightedf1: 0.949\u001b[0m\n",
      "\u001b[92m2020-09-10 19:29:19\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=1000.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:30:10\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_19:30:10 -- Epoch: 3/10; Train; loss: 0.081; acc: 0.972; precision: 0.966, recall: 0.980, macrof1: 0.972, weightedf1: 0.972\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=322.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:30:18\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_19:30:18 -- Epoch: 3/10; Valid; loss: 0.122; acc: 0.958; precision: 0.944, recall: 0.973, macrof1: 0.958, weightedf1: 0.958\u001b[0m\n",
      "\u001b[92m2020-09-10 19:30:18\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=1000.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:31:09\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_19:31:09 -- Epoch: 4/10; Train; loss: 0.054; acc: 0.983; precision: 0.979, recall: 0.988, macrof1: 0.983, weightedf1: 0.983\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=322.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:31:16\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_19:31:16 -- Epoch: 4/10; Valid; loss: 0.127; acc: 0.954; precision: 0.957, recall: 0.950, macrof1: 0.954, weightedf1: 0.954\u001b[0m\n",
      "\u001b[92m2020-09-10 19:31:16\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=1000.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:32:08\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_19:32:08 -- Epoch: 5/10; Train; loss: 0.037; acc: 0.989; precision: 0.987, recall: 0.992, macrof1: 0.989, weightedf1: 0.989\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=322.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:32:16\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_19:32:16 -- Epoch: 5/10; Valid; loss: 0.134; acc: 0.959; precision: 0.948, recall: 0.972, macrof1: 0.959, weightedf1: 0.959\u001b[0m\n",
      "\u001b[92m2020-09-10 19:32:16\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=1000.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:33:07\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_19:33:07 -- Epoch: 6/10; Train; loss: 0.027; acc: 0.992; precision: 0.990, recall: 0.995, macrof1: 0.992, weightedf1: 0.992\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=322.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:33:14\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_19:33:14 -- Epoch: 6/10; Valid; loss: 0.143; acc: 0.959; precision: 0.958, recall: 0.960, macrof1: 0.959, weightedf1: 0.959\u001b[0m\n",
      "\u001b[92m2020-09-10 19:33:14\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=1000.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:34:06\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_19:34:06 -- Epoch: 7/10; Train; loss: 0.021; acc: 0.994; precision: 0.993, recall: 0.995, macrof1: 0.994, weightedf1: 0.994\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=322.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:34:13\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_19:34:13 -- Epoch: 7/10; Valid; loss: 0.155; acc: 0.960; precision: 0.948, recall: 0.974, macrof1: 0.960, weightedf1: 0.960\u001b[0m\n",
      "\u001b[92m2020-09-10 19:34:13\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=1000.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:35:04\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_19:35:04 -- Epoch: 8/10; Train; loss: 0.019; acc: 0.995; precision: 0.994, recall: 0.996, macrof1: 0.995, weightedf1: 0.995\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=322.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:35:11\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_19:35:11 -- Epoch: 8/10; Valid; loss: 0.161; acc: 0.960; precision: 0.957, recall: 0.963, macrof1: 0.960, weightedf1: 0.960\u001b[0m\n",
      "\u001b[92m2020-09-10 19:35:11\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=1000.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:36:02\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_19:36:02 -- Epoch: 9/10; Train; loss: 0.015; acc: 0.996; precision: 0.995, recall: 0.997, macrof1: 0.996, weightedf1: 0.996\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=322.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:36:09\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_19:36:09 -- Epoch: 9/10; Valid; loss: 0.169; acc: 0.960; precision: 0.947, recall: 0.975, macrof1: 0.960, weightedf1: 0.960\u001b[0m\n",
      "\u001b[92m2020-09-10 19:36:09\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=1000.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:37:01\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_19:37:01 -- Epoch: 10/10; Train; loss: 0.016; acc: 0.995; precision: 0.995, recall: 0.996, macrof1: 0.995, weightedf1: 0.995\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=322.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:37:08\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_19:37:08 -- Epoch: 10/10; Valid; loss: 0.186; acc: 0.958; precision: 0.942, recall: 0.975, macrof1: 0.958, weightedf1: 0.958\u001b[0m\n",
      "\u001b[92m2020-09-10 19:37:08\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n",
      "\n",
      "\u001b[92m2020-09-10 19:37:08\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model with least valid loss (checkpoint: 3) at ./models/wikigaz_en_ft_ocr_lstm_v002_n64000/wikigaz_en_ft_ocr_lstm_v002_n64000.model\u001b[0m\n",
      "\n",
      "\n",
      "\n",
      "====================\n",
      "User time: 586.1828\n",
      "====================\n"
     ]
    }
   ],
   "source": [
    "from DeezyMatch import finetune as dm_finetune\n",
    "\n",
    "n_ft_examples = 64000\n",
    "\n",
    "# Fine-tune the pretrained WG:en model (pretrained_model_path, pretrained_vocab_path) on n_ft_examples training pairs from the OCR dataset\n",
    "dm_finetune(input_file_path=\"./inputs/input_dfm_lstm_model_B_no_early_stopping.yaml\", \n",
    "            dataset_path=\"./dataset/ocr_trainval.txt\", \n",
    "            model_name=f\"wikigaz_en_ft_ocr_lstm_v002_n{n_ft_examples}\",\n",
    "            pretrained_model_path=\"./models/wikigaz_en_lstm_001/wikigaz_en_lstm_001.model\", \n",
    "            pretrained_vocab_path=\"./models/wikigaz_en_lstm_001/wikigaz_en_lstm_001.vocab\",\n",
    "            n_train_examples=n_ft_examples\n",
    "           )"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 64,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:37:08\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mread input file: ./inputs/input_dfm_lstm_model_B_no_early_stopping.yaml\u001b[0m\n",
      "\u001b[92m2020-09-10 19:37:08\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32mpytorch will use: cuda:1\u001b[0m\n",
      "\u001b[92m2020-09-10 19:37:08\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mread CSV file: ./dataset/ocr_trainval.txt\u001b[0m\n",
      "\u001b[92m2020-09-10 19:37:09\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32mnumber of labels, True: 42301 and False: 42302\u001b[0m\n",
      "\u001b[92m2020-09-10 19:37:09\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mSplitting the Dataset\u001b[0m\n",
      "\u001b[92m2020-09-10 19:37:09\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mfinish splitting the Dataset. User time: 0.051722049713134766\u001b[0m\n",
      "\u001b[92m2020-09-10 19:37:09\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msplits are as follow:\n",
      "train    84000\n",
      "val        603\n",
      "Name: split, dtype: int64\u001b[0m\n",
      "\u001b[92m2020-09-10 19:37:09\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mstart creating a lookup table and convert characters to indices\u001b[0m\n",
      "\u001b[92m2020-09-10 19:37:09\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32m-- create vocabulary\u001b[0m\n",
      "\u001b[92m2020-09-10 19:37:10\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32m-- convert tokens to indices\u001b[0m\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      "length s1:   0%|          | 0/84000 [00:00<?, ?it/s]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:37:11\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mskipping 0 lines\u001b[0m\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "                                                                     \r"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "\n",
      "============================================================\n",
      "List all parameters in the model\n",
      "============================================================\n",
      "emb.weight False\n",
      "rnn_1.weight_ih_l0 True\n",
      "rnn_1.weight_hh_l0 True\n",
      "rnn_1.bias_ih_l0 True\n",
      "rnn_1.bias_hh_l0 True\n",
      "rnn_1.weight_ih_l0_reverse True\n",
      "rnn_1.weight_hh_l0_reverse True\n",
      "rnn_1.bias_ih_l0_reverse True\n",
      "rnn_1.bias_hh_l0_reverse True\n",
      "rnn_1.weight_ih_l1 True\n",
      "rnn_1.weight_hh_l1 True\n",
      "rnn_1.bias_ih_l1 True\n",
      "rnn_1.bias_hh_l1 True\n",
      "rnn_1.weight_ih_l1_reverse True\n",
      "rnn_1.weight_hh_l1_reverse True\n",
      "rnn_1.bias_ih_l1_reverse True\n",
      "rnn_1.bias_hh_l1_reverse True\n",
      "attn_step1.weight True\n",
      "attn_step1.bias True\n",
      "attn_step2.weight True\n",
      "attn_step2.bias True\n",
      "fc1.weight True\n",
      "fc1.bias True\n",
      "fc2.weight True\n",
      "fc2.bias True\n",
      "============================================================\n",
      "\n",
      "\n",
      "\n",
      "\u001b[92m2020-09-10 19:37:12\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m******************************\u001b[0m\n",
      "\u001b[92m2020-09-10 19:37:12\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m**** (Bi-directional) LSTM ****\u001b[0m\n",
      "\u001b[92m2020-09-10 19:37:12\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m******************************\u001b[0m\n",
      "\u001b[92m2020-09-10 19:37:12\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mNumber of batches: 1313\u001b[0m\n",
      "\u001b[92m2020-09-10 19:37:12\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mNumber of epochs: 10\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "32070cfb74c140c288f4da945a1f0392",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=10.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=1313.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "\n",
      "\n",
      "====================\n",
      "Total number of params: 721323\n",
      "\n",
      "two_parallel_rnns (\n",
      "  (emb): Embedding(7542, 60), weights=((7542, 60),), parameters=452520\n",
      "  (rnn_1): LSTM(60, 60, num_layers=2, dropout=0.01, bidirectional=True), weights=((240, 60), (240, 60), (240,), (240,), (240, 60), (240, 60), (240,), (240,), (240, 120), (240, 60), (240,), (240,), (240, 120), (240, 60), (240,), (240,)), parameters=145920\n",
      "  (attn_step1): Linear(in_features=120, out_features=60, bias=True), weights=((60, 120), (60,)), parameters=7260\n",
      "  (attn_step2): Linear(in_features=60, out_features=1, bias=True), weights=((1, 60), (1,)), parameters=61\n",
      "  (fc1): Linear(in_features=960, out_features=120, bias=True), weights=((120, 960), (120,)), parameters=115320\n",
      "  (fc2): Linear(in_features=120, out_features=2, bias=True), weights=((2, 120), (2,)), parameters=242\n",
      ")\n",
      "====================\n",
      "\n",
      "\n",
      "\u001b[92m2020-09-10 19:38:19\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_19:38:19 -- Epoch: 1/10; Train; loss: 0.268; acc: 0.894; precision: 0.882, recall: 0.909, macrof1: 0.894, weightedf1: 0.894\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=10.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:38:19\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_19:38:19 -- Epoch: 1/10; Valid; loss: 0.156; acc: 0.945; precision: 0.921, recall: 0.973, macrof1: 0.945, weightedf1: 0.945\u001b[0m\n",
      "\u001b[92m2020-09-10 19:38:19\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=1313.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:39:27\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_19:39:27 -- Epoch: 2/10; Train; loss: 0.114; acc: 0.961; precision: 0.952, recall: 0.970, macrof1: 0.961, weightedf1: 0.961\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=10.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:39:27\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_19:39:27 -- Epoch: 2/10; Valid; loss: 0.109; acc: 0.965; precision: 0.949, recall: 0.983, macrof1: 0.965, weightedf1: 0.965\u001b[0m\n",
      "\u001b[92m2020-09-10 19:39:27\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=1313.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:40:35\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_19:40:35 -- Epoch: 3/10; Train; loss: 0.074; acc: 0.975; precision: 0.969, recall: 0.982, macrof1: 0.975, weightedf1: 0.975\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=10.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:40:35\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_19:40:35 -- Epoch: 3/10; Valid; loss: 0.099; acc: 0.964; precision: 0.957, recall: 0.970, macrof1: 0.964, weightedf1: 0.964\u001b[0m\n",
      "\u001b[92m2020-09-10 19:40:35\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=1313.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:41:47\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_19:41:47 -- Epoch: 4/10; Train; loss: 0.049; acc: 0.984; precision: 0.980, recall: 0.989, macrof1: 0.984, weightedf1: 0.984\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=10.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:41:48\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_19:41:48 -- Epoch: 4/10; Valid; loss: 0.093; acc: 0.968; precision: 0.961, recall: 0.977, macrof1: 0.968, weightedf1: 0.968\u001b[0m\n",
      "\u001b[92m2020-09-10 19:41:48\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=1313.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:43:01\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_19:43:01 -- Epoch: 5/10; Train; loss: 0.035; acc: 0.989; precision: 0.987, recall: 0.992, macrof1: 0.989, weightedf1: 0.989\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=10.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:43:01\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_19:43:01 -- Epoch: 5/10; Valid; loss: 0.137; acc: 0.967; precision: 0.958, recall: 0.977, macrof1: 0.967, weightedf1: 0.967\u001b[0m\n",
      "\u001b[92m2020-09-10 19:43:01\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=1313.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:44:15\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_19:44:15 -- Epoch: 6/10; Train; loss: 0.027; acc: 0.992; precision: 0.990, recall: 0.994, macrof1: 0.992, weightedf1: 0.992\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=10.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:44:16\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_19:44:16 -- Epoch: 6/10; Valid; loss: 0.153; acc: 0.965; precision: 0.949, recall: 0.983, macrof1: 0.965, weightedf1: 0.965\u001b[0m\n",
      "\u001b[92m2020-09-10 19:44:16\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=1313.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:45:25\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_19:45:25 -- Epoch: 7/10; Train; loss: 0.023; acc: 0.993; precision: 0.992, recall: 0.995, macrof1: 0.993, weightedf1: 0.993\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=10.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:45:25\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_19:45:25 -- Epoch: 7/10; Valid; loss: 0.121; acc: 0.967; precision: 0.946, recall: 0.990, macrof1: 0.967, weightedf1: 0.967\u001b[0m\n",
      "\u001b[92m2020-09-10 19:45:25\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=1313.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:46:33\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_19:46:33 -- Epoch: 8/10; Train; loss: 0.019; acc: 0.995; precision: 0.993, recall: 0.996, macrof1: 0.995, weightedf1: 0.995\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=10.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:46:33\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_19:46:33 -- Epoch: 8/10; Valid; loss: 0.224; acc: 0.965; precision: 0.949, recall: 0.983, macrof1: 0.965, weightedf1: 0.965\u001b[0m\n",
      "\u001b[92m2020-09-10 19:46:33\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=1313.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:47:42\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_19:47:42 -- Epoch: 9/10; Train; loss: 0.018; acc: 0.995; precision: 0.994, recall: 0.996, macrof1: 0.995, weightedf1: 0.995\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=10.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:47:42\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_19:47:42 -- Epoch: 9/10; Valid; loss: 0.196; acc: 0.965; precision: 0.955, recall: 0.977, macrof1: 0.965, weightedf1: 0.965\u001b[0m\n",
      "\u001b[92m2020-09-10 19:47:42\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=1313.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:48:51\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_19:48:51 -- Epoch: 10/10; Train; loss: 0.016; acc: 0.996; precision: 0.995, recall: 0.997, macrof1: 0.996, weightedf1: 0.996\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=10.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:48:52\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_19:48:52 -- Epoch: 10/10; Valid; loss: 0.173; acc: 0.964; precision: 0.954, recall: 0.973, macrof1: 0.964, weightedf1: 0.964\u001b[0m\n",
      "\u001b[92m2020-09-10 19:48:52\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n",
      "\n",
      "\u001b[92m2020-09-10 19:48:52\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model with least valid loss (checkpoint: 4) at ./models/wikigaz_en_ft_ocr_lstm_v002_n84000/wikigaz_en_ft_ocr_lstm_v002_n84000.model\u001b[0m\n",
      "\n",
      "\n",
      "\n",
      "====================\n",
      "User time: 699.9799\n",
      "====================\n"
     ]
    }
   ],
   "source": [
    "from DeezyMatch import finetune as dm_finetune\n",
    "\n",
    "n_ft_examples = 84000\n",
    "\n",
    "# Fine-tune the pretrained model (weights: pretrained_model_path, vocabulary: pretrained_vocab_path)\n",
    "# using n_ft_examples training pairs from the OCR train/val dataset.\n",
    "dm_finetune(input_file_path=\"./inputs/input_dfm_lstm_model_B_no_early_stopping.yaml\", \n",
    "            dataset_path=\"./dataset/ocr_trainval.txt\", \n",
    "            model_name=f\"wikigaz_en_ft_ocr_lstm_v002_n{n_ft_examples}\",\n",
    "            pretrained_model_path=\"./models/wikigaz_en_lstm_001/wikigaz_en_lstm_001.model\", \n",
    "            pretrained_vocab_path=\"./models/wikigaz_en_lstm_001/wikigaz_en_lstm_001.vocab\",\n",
    "            n_train_examples=n_ft_examples\n",
    "           )"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Fine-Tune, model B, RNN"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 65,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:48:52\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mread input file: ./inputs/input_dfm_rnn_model_B.yaml\u001b[0m\n",
      "\u001b[92m2020-09-10 19:48:52\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32mpytorch will use: cuda:1\u001b[0m\n",
      "\u001b[92m2020-09-10 19:48:52\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mread CSV file: ./dataset/ocr_trainval.txt\u001b[0m\n",
      "\u001b[92m2020-09-10 19:48:52\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32mnumber of labels, True: 42301 and False: 42302\u001b[0m\n",
      "\u001b[92m2020-09-10 19:48:52\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mSplitting the Dataset\u001b[0m\n",
      "\u001b[92m2020-09-10 19:48:52\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mfinish splitting the Dataset. User time: 0.05935072898864746\u001b[0m\n",
      "\u001b[92m2020-09-10 19:48:52\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msplits are as follow:\n",
      "not_assigned    58971\n",
      "val             25380\n",
      "train             250\n",
      "test                2\n",
      "Name: split, dtype: int64\u001b[0m\n",
      "\u001b[92m2020-09-10 19:48:52\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mstart creating a lookup table and convert characters to indices\u001b[0m\n",
      "\u001b[92m2020-09-10 19:48:52\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32m-- create vocabulary\u001b[0m\n",
      "\u001b[92m2020-09-10 19:48:54\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32m-- convert tokens to indices\u001b[0m\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "s1 padding:   0%|          | 0/25380 [00:00<?, ?it/s]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:48:54\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mskipping 0 lines\u001b[0m\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "                                                                     \r"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "\n",
      "============================================================\n",
      "List all parameters in the model\n",
      "============================================================\n",
      "emb.weight False\n",
      "rnn_1.weight_ih_l0 True\n",
      "rnn_1.weight_hh_l0 True\n",
      "rnn_1.bias_ih_l0 True\n",
      "rnn_1.bias_hh_l0 True\n",
      "rnn_1.weight_ih_l0_reverse True\n",
      "rnn_1.weight_hh_l0_reverse True\n",
      "rnn_1.bias_ih_l0_reverse True\n",
      "rnn_1.bias_hh_l0_reverse True\n",
      "rnn_1.weight_ih_l1 True\n",
      "rnn_1.weight_hh_l1 True\n",
      "rnn_1.bias_ih_l1 True\n",
      "rnn_1.bias_hh_l1 True\n",
      "rnn_1.weight_ih_l1_reverse True\n",
      "rnn_1.weight_hh_l1_reverse True\n",
      "rnn_1.bias_ih_l1_reverse True\n",
      "rnn_1.bias_hh_l1_reverse True\n",
      "attn_step1.weight True\n",
      "attn_step1.bias True\n",
      "attn_step2.weight True\n",
      "attn_step2.bias True\n",
      "fc1.weight True\n",
      "fc1.bias True\n",
      "fc2.weight True\n",
      "fc2.bias True\n",
      "============================================================\n",
      "\n",
      "\n",
      "\n",
      "\u001b[92m2020-09-10 19:48:55\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m******************************\u001b[0m\n",
      "\u001b[92m2020-09-10 19:48:55\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m**** (Bi-directional) RNN ****\u001b[0m\n",
      "\u001b[92m2020-09-10 19:48:55\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m******************************\u001b[0m\n",
      "\u001b[92m2020-09-10 19:48:55\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mNumber of batches: 4\u001b[0m\n",
      "\u001b[92m2020-09-10 19:48:55\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mNumber of epochs: 20\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "884c55ec2b9a435c869f5b17fbf9ce1c",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=20.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=4.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "\n",
      "\n",
      "====================\n",
      "Total number of params: 611883\n",
      "\n",
      "two_parallel_rnns (\n",
      "  (emb): Embedding(7542, 60), weights=((7542, 60),), parameters=452520\n",
      "  (rnn_1): RNN(60, 60, num_layers=2, dropout=0.01, bidirectional=True), weights=((60, 60), (60, 60), (60,), (60,), (60, 60), (60, 60), (60,), (60,), (60, 120), (60, 60), (60,), (60,), (60, 120), (60, 60), (60,), (60,)), parameters=36480\n",
      "  (attn_step1): Linear(in_features=120, out_features=60, bias=True), weights=((60, 120), (60,)), parameters=7260\n",
      "  (attn_step2): Linear(in_features=60, out_features=1, bias=True), weights=((1, 60), (1,)), parameters=61\n",
      "  (fc1): Linear(in_features=960, out_features=120, bias=True), weights=((120, 960), (120,)), parameters=115320\n",
      "  (fc2): Linear(in_features=120, out_features=2, bias=True), weights=((2, 120), (2,)), parameters=242\n",
      ")\n",
      "====================\n",
      "\n",
      "\n",
      "\u001b[92m2020-09-10 19:48:55\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_19:48:55 -- Epoch: 1/20; Train; loss: 1.026; acc: 0.520; precision: 0.520, recall: 0.528, macrof1: 0.520, weightedf1: 0.520\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:49:03\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_19:49:03 -- Epoch: 1/20; Valid; loss: 0.999; acc: 0.508; precision: 0.507, recall: 0.582, macrof1: 0.505, weightedf1: 0.505\u001b[0m\n",
      "\u001b[92m2020-09-10 19:49:03\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=4.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:49:03\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_19:49:03 -- Epoch: 2/20; Train; loss: 0.662; acc: 0.628; precision: 0.621, recall: 0.656, macrof1: 0.628, weightedf1: 0.628\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:49:12\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_19:49:12 -- Epoch: 2/20; Valid; loss: 0.896; acc: 0.528; precision: 0.523, recall: 0.614, macrof1: 0.524, weightedf1: 0.524\u001b[0m\n",
      "\u001b[92m2020-09-10 19:49:12\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=4.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:49:12\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_19:49:12 -- Epoch: 3/20; Train; loss: 0.518; acc: 0.740; precision: 0.724, recall: 0.776, macrof1: 0.740, weightedf1: 0.740\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:49:20\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_19:49:20 -- Epoch: 3/20; Valid; loss: 0.845; acc: 0.542; precision: 0.536, recall: 0.632, macrof1: 0.538, weightedf1: 0.538\u001b[0m\n",
      "\u001b[92m2020-09-10 19:49:20\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=4.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:49:21\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_19:49:21 -- Epoch: 4/20; Train; loss: 0.427; acc: 0.776; precision: 0.745, recall: 0.840, macrof1: 0.775, weightedf1: 0.775\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:49:29\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_19:49:29 -- Epoch: 4/20; Valid; loss: 0.824; acc: 0.557; precision: 0.548, recall: 0.652, macrof1: 0.553, weightedf1: 0.553\u001b[0m\n",
      "\u001b[92m2020-09-10 19:49:29\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=4.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:49:29\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_19:49:29 -- Epoch: 5/20; Train; loss: 0.364; acc: 0.848; precision: 0.813, recall: 0.904, macrof1: 0.848, weightedf1: 0.848\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:49:37\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_19:49:37 -- Epoch: 5/20; Valid; loss: 0.825; acc: 0.574; precision: 0.563, recall: 0.658, macrof1: 0.571, weightedf1: 0.571\u001b[0m\n",
      "\u001b[92m2020-09-10 19:49:37\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model (early stopped) with least valid loss (checkpoint: 4) at ./models/wikigaz_en_ft_ocr_rnn_v002_n250/wikigaz_en_ft_ocr_rnn_v002_n250.model\u001b[0m\n",
      "\u001b[92m2020-09-10 19:49:37\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n",
      "\u001b[92m2020-09-10 19:49:37\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mEarly stopping at epoch: 5, selected epoch: 4\u001b[0m\n",
      "\n",
      "\n",
      "\n",
      "\n",
      "====================\n",
      "User time: 42.7793\n",
      "====================\n"
     ]
    }
   ],
   "source": [
    "from DeezyMatch import finetune as dm_finetune\n",
    "\n",
    "n_ft_examples = 250\n",
    "\n",
    "# Fine-tune the pretrained model (weights: pretrained_model_path, vocabulary: pretrained_vocab_path)\n",
    "# using n_ft_examples training pairs from the OCR train/val dataset.\n",
    "dm_finetune(input_file_path=\"./inputs/input_dfm_rnn_model_B.yaml\", \n",
    "            dataset_path=\"./dataset/ocr_trainval.txt\", \n",
    "            model_name=f\"wikigaz_en_ft_ocr_rnn_v002_n{n_ft_examples}\",\n",
    "            pretrained_model_path=\"./models/wikigaz_en_rnn_001/wikigaz_en_rnn_001.model\", \n",
    "            pretrained_vocab_path=\"./models/wikigaz_en_rnn_001/wikigaz_en_rnn_001.vocab\",\n",
    "            n_train_examples=n_ft_examples\n",
    "           )"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 66,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:49:37\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mread input file: ./inputs/input_dfm_rnn_model_B.yaml\u001b[0m\n",
      "\u001b[92m2020-09-10 19:49:37\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32mpytorch will use: cuda:1\u001b[0m\n",
      "\u001b[92m2020-09-10 19:49:37\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mread CSV file: ./dataset/ocr_trainval.txt\u001b[0m\n",
      "\u001b[92m2020-09-10 19:49:38\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32mnumber of labels, True: 42301 and False: 42302\u001b[0m\n",
      "\u001b[92m2020-09-10 19:49:38\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mSplitting the Dataset\u001b[0m\n",
      "\u001b[92m2020-09-10 19:49:38\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mfinish splitting the Dataset. User time: 0.05241036415100098\u001b[0m\n",
      "\u001b[92m2020-09-10 19:49:38\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msplits are as follow:\n",
      "not_assigned    58721\n",
      "val             25380\n",
      "train             500\n",
      "test                2\n",
      "Name: split, dtype: int64\u001b[0m\n",
      "\u001b[92m2020-09-10 19:49:38\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mstart creating a lookup table and convert characters to indices\u001b[0m\n",
      "\u001b[92m2020-09-10 19:49:38\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32m-- create vocabulary\u001b[0m\n",
      "\u001b[92m2020-09-10 19:49:39\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32m-- convert tokens to indices\u001b[0m\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "s1 padding:   0%|          | 0/25380 [00:00<?, ?it/s]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:49:40\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mskipping 0 lines\u001b[0m\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "                                                                     \r"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "\n",
      "============================================================\n",
      "List all parameters in the model\n",
      "============================================================\n",
      "emb.weight False\n",
      "rnn_1.weight_ih_l0 True\n",
      "rnn_1.weight_hh_l0 True\n",
      "rnn_1.bias_ih_l0 True\n",
      "rnn_1.bias_hh_l0 True\n",
      "rnn_1.weight_ih_l0_reverse True\n",
      "rnn_1.weight_hh_l0_reverse True\n",
      "rnn_1.bias_ih_l0_reverse True\n",
      "rnn_1.bias_hh_l0_reverse True\n",
      "rnn_1.weight_ih_l1 True\n",
      "rnn_1.weight_hh_l1 True\n",
      "rnn_1.bias_ih_l1 True\n",
      "rnn_1.bias_hh_l1 True\n",
      "rnn_1.weight_ih_l1_reverse True\n",
      "rnn_1.weight_hh_l1_reverse True\n",
      "rnn_1.bias_ih_l1_reverse True\n",
      "rnn_1.bias_hh_l1_reverse True\n",
      "attn_step1.weight True\n",
      "attn_step1.bias True\n",
      "attn_step2.weight True\n",
      "attn_step2.bias True\n",
      "fc1.weight True\n",
      "fc1.bias True\n",
      "fc2.weight True\n",
      "fc2.bias True\n",
      "============================================================\n",
      "\n",
      "\n",
      "\n",
      "\u001b[92m2020-09-10 19:49:40\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m******************************\u001b[0m\n",
      "\u001b[92m2020-09-10 19:49:40\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m**** (Bi-directional) RNN ****\u001b[0m\n",
      "\u001b[92m2020-09-10 19:49:40\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m******************************\u001b[0m\n",
      "\u001b[92m2020-09-10 19:49:40\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mNumber of batches: 8\u001b[0m\n",
      "\u001b[92m2020-09-10 19:49:40\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mNumber of epochs: 20\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "c195e90c166d45ec9de257bf98b83fbe",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=20.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=8.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "\n",
      "\n",
      "====================\n",
      "Total number of params: 611883\n",
      "\n",
      "two_parallel_rnns (\n",
      "  (emb): Embedding(7542, 60), weights=((7542, 60),), parameters=452520\n",
      "  (rnn_1): RNN(60, 60, num_layers=2, dropout=0.01, bidirectional=True), weights=((60, 60), (60, 60), (60,), (60,), (60, 60), (60, 60), (60,), (60,), (60, 120), (60, 60), (60,), (60,), (60, 120), (60, 60), (60,), (60,)), parameters=36480\n",
      "  (attn_step1): Linear(in_features=120, out_features=60, bias=True), weights=((60, 120), (60,)), parameters=7260\n",
      "  (attn_step2): Linear(in_features=60, out_features=1, bias=True), weights=((1, 60), (1,)), parameters=61\n",
      "  (fc1): Linear(in_features=960, out_features=120, bias=True), weights=((120, 960), (120,)), parameters=115320\n",
      "  (fc2): Linear(in_features=120, out_features=2, bias=True), weights=((2, 120), (2,)), parameters=242\n",
      ")\n",
      "====================\n",
      "\n",
      "\n",
      "\u001b[92m2020-09-10 19:49:41\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_19:49:41 -- Epoch: 1/20; Train; loss: 1.019; acc: 0.500; precision: 0.500, recall: 0.532, macrof1: 0.499, weightedf1: 0.499\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:49:49\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_19:49:49 -- Epoch: 1/20; Valid; loss: 0.844; acc: 0.519; precision: 0.517, recall: 0.593, macrof1: 0.517, weightedf1: 0.517\u001b[0m\n",
      "\u001b[92m2020-09-10 19:49:49\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=8.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:49:49\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_19:49:49 -- Epoch: 2/20; Train; loss: 0.661; acc: 0.622; precision: 0.606, recall: 0.700, macrof1: 0.620, weightedf1: 0.620\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:49:57\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_19:49:57 -- Epoch: 2/20; Valid; loss: 0.733; acc: 0.553; precision: 0.543, recall: 0.676, macrof1: 0.547, weightedf1: 0.547\u001b[0m\n",
      "\u001b[92m2020-09-10 19:49:57\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=8.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:49:58\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_19:49:58 -- Epoch: 3/20; Train; loss: 0.566; acc: 0.702; precision: 0.674, recall: 0.784, macrof1: 0.700, weightedf1: 0.700\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:50:06\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_19:50:06 -- Epoch: 3/20; Valid; loss: 0.698; acc: 0.576; precision: 0.560, recall: 0.707, macrof1: 0.568, weightedf1: 0.568\u001b[0m\n",
      "\u001b[92m2020-09-10 19:50:06\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=8.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:50:06\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_19:50:06 -- Epoch: 4/20; Train; loss: 0.496; acc: 0.730; precision: 0.705, recall: 0.792, macrof1: 0.729, weightedf1: 0.729\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:50:15\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_19:50:15 -- Epoch: 4/20; Valid; loss: 0.688; acc: 0.591; precision: 0.578, recall: 0.673, macrof1: 0.588, weightedf1: 0.588\u001b[0m\n",
      "\u001b[92m2020-09-10 19:50:15\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:50:15\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_19:50:15 -- Epoch: 5/20; Train; loss: 0.433; acc: 0.782; precision: 0.785, recall: 0.776, macrof1: 0.782, weightedf1: 0.782\u001b[0m\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:50:23\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_19:50:23 -- Epoch: 5/20; Valid; loss: 0.699; acc: 0.610; precision: 0.600, recall: 0.655, macrof1: 0.609, weightedf1: 0.609\u001b[0m\n",
      "\u001b[92m2020-09-10 19:50:23\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model (early stopped) with least valid loss (checkpoint: 4) at ./models/wikigaz_en_ft_ocr_rnn_v002_n500/wikigaz_en_ft_ocr_rnn_v002_n500.model\u001b[0m\n",
      "\u001b[92m2020-09-10 19:50:23\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n",
      "\u001b[92m2020-09-10 19:50:23\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mEarly stopping at epoch: 5, selected epoch: 4\u001b[0m\n",
      "\n",
      "\n",
      "\n",
      "\n",
      "====================\n",
      "User time: 42.7849\n",
      "====================\n"
     ]
    }
   ],
   "source": [
    "from DeezyMatch import finetune as dm_finetune\n",
    "\n",
    "# number of OCR training instances used for fine-tuning\n",
    "n_ft_examples = 500\n",
    "\n",
    "# Fine-tune the pretrained wikigaz_en_rnn_001 model on n_ft_examples training instances from the OCR dataset.\n",
    "# The model B input file (input_dfm_rnn_model_B.yaml) freezes only the embedding layer.\n",
    "dm_finetune(input_file_path=\"./inputs/input_dfm_rnn_model_B.yaml\",\n",
    "            dataset_path=\"./dataset/ocr_trainval.txt\",\n",
    "            model_name=f\"wikigaz_en_ft_ocr_rnn_v002_n{n_ft_examples}\",\n",
    "            pretrained_model_path=\"./models/wikigaz_en_rnn_001/wikigaz_en_rnn_001.model\",\n",
    "            pretrained_vocab_path=\"./models/wikigaz_en_rnn_001/wikigaz_en_rnn_001.vocab\",\n",
    "            n_train_examples=n_ft_examples\n",
    "           )"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 67,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:50:23\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mread input file: ./inputs/input_dfm_rnn_model_B.yaml\u001b[0m\n",
      "\u001b[92m2020-09-10 19:50:23\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32mpytorch will use: cuda:1\u001b[0m\n",
      "\u001b[92m2020-09-10 19:50:23\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mread CSV file: ./dataset/ocr_trainval.txt\u001b[0m\n",
      "\u001b[92m2020-09-10 19:50:24\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32mnumber of labels, True: 42301 and False: 42302\u001b[0m\n",
      "\u001b[92m2020-09-10 19:50:24\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mSplitting the Dataset\u001b[0m\n",
      "\u001b[92m2020-09-10 19:50:24\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mfinish splitting the Dataset. User time: 0.05342698097229004\u001b[0m\n",
      "\u001b[92m2020-09-10 19:50:24\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msplits are as follow:\n",
      "not_assigned    58221\n",
      "val             25380\n",
      "train            1000\n",
      "test                2\n",
      "Name: split, dtype: int64\u001b[0m\n",
      "\u001b[92m2020-09-10 19:50:24\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mstart creating a lookup table and convert characters to indices\u001b[0m\n",
      "\u001b[92m2020-09-10 19:50:24\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32m-- create vocabulary\u001b[0m\n",
      "\u001b[92m2020-09-10 19:50:25\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32m-- convert tokens to indices\u001b[0m\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:50:26\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mskipping 0 lines\u001b[0m\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "\n",
      "============================================================\n",
      "List all parameters in the model\n",
      "============================================================\n",
      "emb.weight False\n",
      "rnn_1.weight_ih_l0 True\n",
      "rnn_1.weight_hh_l0 True\n",
      "rnn_1.bias_ih_l0 True\n",
      "rnn_1.bias_hh_l0 True\n",
      "rnn_1.weight_ih_l0_reverse True\n",
      "rnn_1.weight_hh_l0_reverse True\n",
      "rnn_1.bias_ih_l0_reverse True\n",
      "rnn_1.bias_hh_l0_reverse True\n",
      "rnn_1.weight_ih_l1 True\n",
      "rnn_1.weight_hh_l1 True\n",
      "rnn_1.bias_ih_l1 True\n",
      "rnn_1.bias_hh_l1 True\n",
      "rnn_1.weight_ih_l1_reverse True\n",
      "rnn_1.weight_hh_l1_reverse True\n",
      "rnn_1.bias_ih_l1_reverse True\n",
      "rnn_1.bias_hh_l1_reverse True\n",
      "attn_step1.weight True\n",
      "attn_step1.bias True\n",
      "attn_step2.weight True\n",
      "attn_step2.bias True\n",
      "fc1.weight True\n",
      "fc1.bias True\n",
      "fc2.weight True\n",
      "fc2.bias True\n",
      "============================================================\n",
      "\n",
      "\n",
      "\n",
      "\u001b[92m2020-09-10 19:50:26\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m******************************\u001b[0m\n",
      "\u001b[92m2020-09-10 19:50:26\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m**** (Bi-directional) RNN ****\u001b[0m\n",
      "\u001b[92m2020-09-10 19:50:26\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m******************************\u001b[0m\n",
      "\u001b[92m2020-09-10 19:50:26\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mNumber of batches: 16\u001b[0m\n",
      "\u001b[92m2020-09-10 19:50:26\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mNumber of epochs: 20\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "cf5e57aede74483c96036bf2edeeb246",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=20.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "\n",
      "\n",
      "====================\n",
      "Total number of params: 611883\n",
      "\n",
      "two_parallel_rnns (\n",
      "  (emb): Embedding(7542, 60), weights=((7542, 60),), parameters=452520\n",
      "  (rnn_1): RNN(60, 60, num_layers=2, dropout=0.01, bidirectional=True), weights=((60, 60), (60, 60), (60,), (60,), (60, 60), (60, 60), (60,), (60,), (60, 120), (60, 60), (60,), (60,), (60, 120), (60, 60), (60,), (60,)), parameters=36480\n",
      "  (attn_step1): Linear(in_features=120, out_features=60, bias=True), weights=((60, 120), (60,)), parameters=7260\n",
      "  (attn_step2): Linear(in_features=60, out_features=1, bias=True), weights=((1, 60), (1,)), parameters=61\n",
      "  (fc1): Linear(in_features=960, out_features=120, bias=True), weights=((120, 960), (120,)), parameters=115320\n",
      "  (fc2): Linear(in_features=120, out_features=2, bias=True), weights=((2, 120), (2,)), parameters=242\n",
      ")\n",
      "====================\n",
      "\n",
      "\n",
      "\u001b[92m2020-09-10 19:50:27\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_19:50:27 -- Epoch: 1/20; Train; loss: 0.895; acc: 0.511; precision: 0.510, recall: 0.552, macrof1: 0.510, weightedf1: 0.510\u001b[0m\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:50:35\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_19:50:35 -- Epoch: 1/20; Valid; loss: 0.737; acc: 0.550; precision: 0.542, recall: 0.641, macrof1: 0.546, weightedf1: 0.546\u001b[0m\n",
      "\u001b[92m2020-09-10 19:50:35\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:50:36\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_19:50:36 -- Epoch: 2/20; Train; loss: 0.615; acc: 0.657; precision: 0.627, recall: 0.776, macrof1: 0.652, weightedf1: 0.652\u001b[0m\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:50:45\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_19:50:45 -- Epoch: 2/20; Valid; loss: 0.669; acc: 0.598; precision: 0.578, recall: 0.727, macrof1: 0.591, weightedf1: 0.591\u001b[0m\n",
      "\u001b[92m2020-09-10 19:50:45\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:50:45\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_19:50:45 -- Epoch: 3/20; Train; loss: 0.526; acc: 0.702; precision: 0.680, recall: 0.762, macrof1: 0.701, weightedf1: 0.701\u001b[0m\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:50:54\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_19:50:54 -- Epoch: 3/20; Valid; loss: 0.642; acc: 0.631; precision: 0.619, recall: 0.685, macrof1: 0.630, weightedf1: 0.630\u001b[0m\n",
      "\u001b[92m2020-09-10 19:50:54\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:50:54\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_19:50:54 -- Epoch: 4/20; Train; loss: 0.437; acc: 0.777; precision: 0.767, recall: 0.796, macrof1: 0.777, weightedf1: 0.777\u001b[0m\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:51:03\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_19:51:03 -- Epoch: 4/20; Valid; loss: 0.628; acc: 0.664; precision: 0.653, recall: 0.697, macrof1: 0.663, weightedf1: 0.663\u001b[0m\n",
      "\u001b[92m2020-09-10 19:51:03\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:51:04\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_19:51:04 -- Epoch: 5/20; Train; loss: 0.352; acc: 0.846; precision: 0.837, recall: 0.860, macrof1: 0.846, weightedf1: 0.846\u001b[0m\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:51:12\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_19:51:12 -- Epoch: 5/20; Valid; loss: 0.624; acc: 0.691; precision: 0.700, recall: 0.668, macrof1: 0.691, weightedf1: 0.691\u001b[0m\n",
      "\u001b[92m2020-09-10 19:51:12\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:51:12\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_19:51:12 -- Epoch: 6/20; Train; loss: 0.264; acc: 0.894; precision: 0.917, recall: 0.866, macrof1: 0.894, weightedf1: 0.894\u001b[0m\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:51:21\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_19:51:21 -- Epoch: 6/20; Valid; loss: 0.646; acc: 0.715; precision: 0.706, recall: 0.739, macrof1: 0.715, weightedf1: 0.715\u001b[0m\n",
      "\u001b[92m2020-09-10 19:51:21\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model (early stopped) with least valid loss (checkpoint: 5) at ./models/wikigaz_en_ft_ocr_rnn_v002_n1000/wikigaz_en_ft_ocr_rnn_v002_n1000.model\u001b[0m\n",
      "\u001b[92m2020-09-10 19:51:21\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n",
      "\u001b[92m2020-09-10 19:51:21\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mEarly stopping at epoch: 6, selected epoch: 5\u001b[0m\n",
      "\n",
      "\n",
      "\n",
      "\n",
      "====================\n",
      "User time: 54.3822\n",
      "====================\n"
     ]
    }
   ],
   "source": [
    "from DeezyMatch import finetune as dm_finetune\n",
    "\n",
    "# number of OCR training instances used for fine-tuning\n",
    "n_ft_examples = 1000\n",
    "\n",
    "# Fine-tune the pretrained wikigaz_en_rnn_001 model on n_ft_examples training instances from the OCR dataset.\n",
    "# The model B input file (input_dfm_rnn_model_B.yaml) freezes only the embedding layer.\n",
    "dm_finetune(input_file_path=\"./inputs/input_dfm_rnn_model_B.yaml\",\n",
    "            dataset_path=\"./dataset/ocr_trainval.txt\",\n",
    "            model_name=f\"wikigaz_en_ft_ocr_rnn_v002_n{n_ft_examples}\",\n",
    "            pretrained_model_path=\"./models/wikigaz_en_rnn_001/wikigaz_en_rnn_001.model\",\n",
    "            pretrained_vocab_path=\"./models/wikigaz_en_rnn_001/wikigaz_en_rnn_001.vocab\",\n",
    "            n_train_examples=n_ft_examples\n",
    "           )"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 68,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:51:21\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mread input file: ./inputs/input_dfm_rnn_model_B.yaml\u001b[0m\n",
      "\u001b[92m2020-09-10 19:51:21\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32mpytorch will use: cuda:1\u001b[0m\n",
      "\u001b[92m2020-09-10 19:51:21\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mread CSV file: ./dataset/ocr_trainval.txt\u001b[0m\n",
      "\u001b[92m2020-09-10 19:51:21\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32mnumber of labels, True: 42301 and False: 42302\u001b[0m\n",
      "\u001b[92m2020-09-10 19:51:21\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mSplitting the Dataset\u001b[0m\n",
      "\u001b[92m2020-09-10 19:51:21\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mfinish splitting the Dataset. User time: 0.05343437194824219\u001b[0m\n",
      "\u001b[92m2020-09-10 19:51:21\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msplits are as follow:\n",
      "not_assigned    57221\n",
      "val             25380\n",
      "train            2000\n",
      "test                2\n",
      "Name: split, dtype: int64\u001b[0m\n",
      "\u001b[92m2020-09-10 19:51:21\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mstart creating a lookup table and convert characters to indices\u001b[0m\n",
      "\u001b[92m2020-09-10 19:51:21\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32m-- create vocabulary\u001b[0m\n",
      "\u001b[92m2020-09-10 19:51:22\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32m-- convert tokens to indices\u001b[0m\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:51:23\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mskipping 0 lines\u001b[0m\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "\n",
      "============================================================\n",
      "List all parameters in the model\n",
      "============================================================\n",
      "emb.weight False\n",
      "rnn_1.weight_ih_l0 True\n",
      "rnn_1.weight_hh_l0 True\n",
      "rnn_1.bias_ih_l0 True\n",
      "rnn_1.bias_hh_l0 True\n",
      "rnn_1.weight_ih_l0_reverse True\n",
      "rnn_1.weight_hh_l0_reverse True\n",
      "rnn_1.bias_ih_l0_reverse True\n",
      "rnn_1.bias_hh_l0_reverse True\n",
      "rnn_1.weight_ih_l1 True\n",
      "rnn_1.weight_hh_l1 True\n",
      "rnn_1.bias_ih_l1 True\n",
      "rnn_1.bias_hh_l1 True\n",
      "rnn_1.weight_ih_l1_reverse True\n",
      "rnn_1.weight_hh_l1_reverse True\n",
      "rnn_1.bias_ih_l1_reverse True\n",
      "rnn_1.bias_hh_l1_reverse True\n",
      "attn_step1.weight True\n",
      "attn_step1.bias True\n",
      "attn_step2.weight True\n",
      "attn_step2.bias True\n",
      "fc1.weight True\n",
      "fc1.bias True\n",
      "fc2.weight True\n",
      "fc2.bias True\n",
      "============================================================\n",
      "\n",
      "\n",
      "\n",
      "\u001b[92m2020-09-10 19:51:23\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m******************************\u001b[0m\n",
      "\u001b[92m2020-09-10 19:51:23\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m**** (Bi-directional) RNN ****\u001b[0m\n",
      "\u001b[92m2020-09-10 19:51:23\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m******************************\u001b[0m\n",
      "\u001b[92m2020-09-10 19:51:23\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mNumber of batches: 32\u001b[0m\n",
      "\u001b[92m2020-09-10 19:51:23\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mNumber of epochs: 20\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "f4c9d3ca48594b86915f1a56489eb476",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=20.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "\n",
      "\n",
      "====================\n",
      "Total number of params: 611883\n",
      "\n",
      "two_parallel_rnns (\n",
      "  (emb): Embedding(7542, 60), weights=((7542, 60),), parameters=452520\n",
      "  (rnn_1): RNN(60, 60, num_layers=2, dropout=0.01, bidirectional=True), weights=((60, 60), (60, 60), (60,), (60,), (60, 60), (60, 60), (60,), (60,), (60, 120), (60, 60), (60,), (60,), (60, 120), (60, 60), (60,), (60,)), parameters=36480\n",
      "  (attn_step1): Linear(in_features=120, out_features=60, bias=True), weights=((60, 120), (60,)), parameters=7260\n",
      "  (attn_step2): Linear(in_features=60, out_features=1, bias=True), weights=((1, 60), (1,)), parameters=61\n",
      "  (fc1): Linear(in_features=960, out_features=120, bias=True), weights=((120, 960), (120,)), parameters=115320\n",
      "  (fc2): Linear(in_features=120, out_features=2, bias=True), weights=((2, 120), (2,)), parameters=242\n",
      ")\n",
      "====================\n",
      "\n",
      "\n",
      "\u001b[92m2020-09-10 19:51:25\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_19:51:25 -- Epoch: 1/20; Train; loss: 0.785; acc: 0.559; precision: 0.550, recall: 0.644, macrof1: 0.556, weightedf1: 0.556\u001b[0m\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:51:33\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_19:51:33 -- Epoch: 1/20; Valid; loss: 0.654; acc: 0.602; precision: 0.576, recall: 0.775, macrof1: 0.589, weightedf1: 0.589\u001b[0m\n",
      "\u001b[92m2020-09-10 19:51:33\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:51:35\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_19:51:35 -- Epoch: 2/20; Train; loss: 0.559; acc: 0.686; precision: 0.661, recall: 0.766, macrof1: 0.685, weightedf1: 0.685\u001b[0m\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:51:43\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_19:51:43 -- Epoch: 2/20; Valid; loss: 0.598; acc: 0.682; precision: 0.685, recall: 0.673, macrof1: 0.682, weightedf1: 0.682\u001b[0m\n",
      "\u001b[92m2020-09-10 19:51:43\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:51:44\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_19:51:44 -- Epoch: 3/20; Train; loss: 0.445; acc: 0.777; precision: 0.783, recall: 0.768, macrof1: 0.777, weightedf1: 0.777\u001b[0m\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:51:52\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_19:51:52 -- Epoch: 3/20; Valid; loss: 0.543; acc: 0.745; precision: 0.751, recall: 0.733, macrof1: 0.745, weightedf1: 0.745\u001b[0m\n",
      "\u001b[92m2020-09-10 19:51:52\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:51:54\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_19:51:54 -- Epoch: 4/20; Train; loss: 0.315; acc: 0.863; precision: 0.885, recall: 0.833, macrof1: 0.862, weightedf1: 0.862\u001b[0m\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:52:02\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_19:52:02 -- Epoch: 4/20; Valid; loss: 0.531; acc: 0.767; precision: 0.765, recall: 0.772, macrof1: 0.767, weightedf1: 0.767\u001b[0m\n",
      "\u001b[92m2020-09-10 19:52:02\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=32.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:52:04\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_19:52:04 -- Epoch: 5/20; Train; loss: 0.218; acc: 0.916; precision: 0.930, recall: 0.901, macrof1: 0.916, weightedf1: 0.916\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:52:12\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_19:52:12 -- Epoch: 5/20; Valid; loss: 0.542; acc: 0.779; precision: 0.779, recall: 0.779, macrof1: 0.779, weightedf1: 0.779\u001b[0m\n",
      "\u001b[92m2020-09-10 19:52:12\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model (early stopped) with least valid loss (checkpoint: 4) at ./models/wikigaz_en_ft_ocr_rnn_v002_n2000/wikigaz_en_ft_ocr_rnn_v002_n2000.model\u001b[0m\n",
      "\u001b[92m2020-09-10 19:52:12\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n",
      "\u001b[92m2020-09-10 19:52:12\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mEarly stopping at epoch: 5, selected epoch: 4\u001b[0m\n",
      "\n",
      "\n",
      "\n",
      "\n",
      "====================\n",
      "User time: 48.5973\n",
      "====================\n"
     ]
    }
   ],
   "source": [
    "from DeezyMatch import finetune as dm_finetune\n",
    "\n",
    "n_ft_examples = 2000\n",
    "\n",
    "# Fine-tune the pretrained WG:en model (model B config: only the embedding layer is frozen)\n",
    "# on n_ft_examples training instances drawn from the OCR dataset.\n",
    "dm_finetune(input_file_path=\"./inputs/input_dfm_rnn_model_B.yaml\",\n",
    "            dataset_path=\"./dataset/ocr_trainval.txt\",\n",
    "            model_name=f\"wikigaz_en_ft_ocr_rnn_v002_n{n_ft_examples}\",\n",
    "            pretrained_model_path=\"./models/wikigaz_en_rnn_001/wikigaz_en_rnn_001.model\",\n",
    "            pretrained_vocab_path=\"./models/wikigaz_en_rnn_001/wikigaz_en_rnn_001.vocab\",\n",
    "            n_train_examples=n_ft_examples\n",
    "           )"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 69,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:52:12\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mread input file: ./inputs/input_dfm_rnn_model_B.yaml\u001b[0m\n",
      "\u001b[92m2020-09-10 19:52:12\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32mpytorch will use: cuda:1\u001b[0m\n",
      "\u001b[92m2020-09-10 19:52:12\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mread CSV file: ./dataset/ocr_trainval.txt\u001b[0m\n",
      "\u001b[92m2020-09-10 19:52:13\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32mnumber of labels, True: 42301 and False: 42302\u001b[0m\n",
      "\u001b[92m2020-09-10 19:52:13\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mSplitting the Dataset\u001b[0m\n",
      "\u001b[92m2020-09-10 19:52:13\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mfinish splitting the Dataset. User time: 0.059014320373535156\u001b[0m\n",
      "\u001b[92m2020-09-10 19:52:13\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msplits are as follow:\n",
      "not_assigned    55221\n",
      "val             25380\n",
      "train            4000\n",
      "test                2\n",
      "Name: split, dtype: int64\u001b[0m\n",
      "\u001b[92m2020-09-10 19:52:13\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mstart creating a lookup table and convert characters to indices\u001b[0m\n",
      "\u001b[92m2020-09-10 19:52:13\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32m-- create vocabulary\u001b[0m\n",
      "\u001b[92m2020-09-10 19:52:14\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32m-- convert tokens to indices\u001b[0m\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "length s2:   0%|          | 0/25380 [00:00<?, ?it/s]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:52:15\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mskipping 0 lines\u001b[0m\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "                                                                     \r"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "\n",
      "============================================================\n",
      "List all parameters in the model\n",
      "============================================================\n",
      "emb.weight False\n",
      "rnn_1.weight_ih_l0 True\n",
      "rnn_1.weight_hh_l0 True\n",
      "rnn_1.bias_ih_l0 True\n",
      "rnn_1.bias_hh_l0 True\n",
      "rnn_1.weight_ih_l0_reverse True\n",
      "rnn_1.weight_hh_l0_reverse True\n",
      "rnn_1.bias_ih_l0_reverse True\n",
      "rnn_1.bias_hh_l0_reverse True\n",
      "rnn_1.weight_ih_l1 True\n",
      "rnn_1.weight_hh_l1 True\n",
      "rnn_1.bias_ih_l1 True\n",
      "rnn_1.bias_hh_l1 True\n",
      "rnn_1.weight_ih_l1_reverse True\n",
      "rnn_1.weight_hh_l1_reverse True\n",
      "rnn_1.bias_ih_l1_reverse True\n",
      "rnn_1.bias_hh_l1_reverse True\n",
      "attn_step1.weight True\n",
      "attn_step1.bias True\n",
      "attn_step2.weight True\n",
      "attn_step2.bias True\n",
      "fc1.weight True\n",
      "fc1.bias True\n",
      "fc2.weight True\n",
      "fc2.bias True\n",
      "============================================================\n",
      "\n",
      "\n",
      "\n",
      "\u001b[92m2020-09-10 19:52:15\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m******************************\u001b[0m\n",
      "\u001b[92m2020-09-10 19:52:15\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m**** (Bi-directional) RNN ****\u001b[0m\n",
      "\u001b[92m2020-09-10 19:52:15\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m******************************\u001b[0m\n",
      "\u001b[92m2020-09-10 19:52:15\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mNumber of batches: 63\u001b[0m\n",
      "\u001b[92m2020-09-10 19:52:15\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mNumber of epochs: 20\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "a88ad27e436447278f375bbb6de40300",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=20.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=63.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "\n",
      "\n",
      "====================\n",
      "Total number of params: 611883\n",
      "\n",
      "two_parallel_rnns (\n",
      "  (emb): Embedding(7542, 60), weights=((7542, 60),), parameters=452520\n",
      "  (rnn_1): RNN(60, 60, num_layers=2, dropout=0.01, bidirectional=True), weights=((60, 60), (60, 60), (60,), (60,), (60, 60), (60, 60), (60,), (60,), (60, 120), (60, 60), (60,), (60,), (60, 120), (60, 60), (60,), (60,)), parameters=36480\n",
      "  (attn_step1): Linear(in_features=120, out_features=60, bias=True), weights=((60, 120), (60,)), parameters=7260\n",
      "  (attn_step2): Linear(in_features=60, out_features=1, bias=True), weights=((1, 60), (1,)), parameters=61\n",
      "  (fc1): Linear(in_features=960, out_features=120, bias=True), weights=((120, 960), (120,)), parameters=115320\n",
      "  (fc2): Linear(in_features=120, out_features=2, bias=True), weights=((2, 120), (2,)), parameters=242\n",
      ")\n",
      "====================\n",
      "\n",
      "\n",
      "\u001b[92m2020-09-10 19:52:18\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_19:52:18 -- Epoch: 1/20; Train; loss: 0.684; acc: 0.605; precision: 0.590, recall: 0.689, macrof1: 0.602, weightedf1: 0.602\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:52:27\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_19:52:27 -- Epoch: 1/20; Valid; loss: 0.584; acc: 0.696; precision: 0.709, recall: 0.665, macrof1: 0.696, weightedf1: 0.696\u001b[0m\n",
      "\u001b[92m2020-09-10 19:52:27\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=63.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:52:30\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_19:52:30 -- Epoch: 2/20; Train; loss: 0.475; acc: 0.765; precision: 0.770, recall: 0.756, macrof1: 0.765, weightedf1: 0.765\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:52:38\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_19:52:38 -- Epoch: 2/20; Valid; loss: 0.492; acc: 0.770; precision: 0.781, recall: 0.750, macrof1: 0.770, weightedf1: 0.770\u001b[0m\n",
      "\u001b[92m2020-09-10 19:52:38\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=63.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:52:41\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_19:52:41 -- Epoch: 3/20; Train; loss: 0.341; acc: 0.849; precision: 0.848, recall: 0.851, macrof1: 0.849, weightedf1: 0.849\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:52:49\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_19:52:49 -- Epoch: 3/20; Valid; loss: 0.451; acc: 0.800; precision: 0.801, recall: 0.800, macrof1: 0.800, weightedf1: 0.800\u001b[0m\n",
      "\u001b[92m2020-09-10 19:52:49\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=63.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:52:52\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_19:52:52 -- Epoch: 4/20; Train; loss: 0.230; acc: 0.911; precision: 0.914, recall: 0.908, macrof1: 0.911, weightedf1: 0.911\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:53:00\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_19:53:00 -- Epoch: 4/20; Valid; loss: 0.459; acc: 0.811; precision: 0.819, recall: 0.799, macrof1: 0.811, weightedf1: 0.811\u001b[0m\n",
      "\u001b[92m2020-09-10 19:53:00\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model (early stopped) with least valid loss (checkpoint: 3) at ./models/wikigaz_en_ft_ocr_rnn_v002_n4000/wikigaz_en_ft_ocr_rnn_v002_n4000.model\u001b[0m\n",
      "\u001b[92m2020-09-10 19:53:00\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n",
      "\u001b[92m2020-09-10 19:53:00\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mEarly stopping at epoch: 4, selected epoch: 3\u001b[0m\n",
      "\n",
      "\n",
      "\n",
      "\n",
      "====================\n",
      "User time: 44.3694\n",
      "====================\n"
     ]
    }
   ],
   "source": [
    "from DeezyMatch import finetune as dm_finetune\n",
    "\n",
    "n_ft_examples = 4000\n",
    "\n",
    "# Fine-tune the pretrained WG:en model (model B config: only the embedding layer is frozen)\n",
    "# on n_ft_examples training instances drawn from the OCR dataset.\n",
    "dm_finetune(input_file_path=\"./inputs/input_dfm_rnn_model_B.yaml\",\n",
    "            dataset_path=\"./dataset/ocr_trainval.txt\",\n",
    "            model_name=f\"wikigaz_en_ft_ocr_rnn_v002_n{n_ft_examples}\",\n",
    "            pretrained_model_path=\"./models/wikigaz_en_rnn_001/wikigaz_en_rnn_001.model\",\n",
    "            pretrained_vocab_path=\"./models/wikigaz_en_rnn_001/wikigaz_en_rnn_001.vocab\",\n",
    "            n_train_examples=n_ft_examples\n",
    "           )"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 70,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:53:00\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mread input file: ./inputs/input_dfm_rnn_model_B.yaml\u001b[0m\n",
      "\u001b[92m2020-09-10 19:53:00\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32mpytorch will use: cuda:1\u001b[0m\n",
      "\u001b[92m2020-09-10 19:53:00\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mread CSV file: ./dataset/ocr_trainval.txt\u001b[0m\n",
      "\u001b[92m2020-09-10 19:53:00\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32mnumber of labels, True: 42301 and False: 42302\u001b[0m\n",
      "\u001b[92m2020-09-10 19:53:00\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mSplitting the Dataset\u001b[0m\n",
      "\u001b[92m2020-09-10 19:53:00\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mfinish splitting the Dataset. User time: 0.052842140197753906\u001b[0m\n",
      "\u001b[92m2020-09-10 19:53:00\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msplits are as follow:\n",
      "not_assigned    51221\n",
      "val             25380\n",
      "train            8000\n",
      "test                2\n",
      "Name: split, dtype: int64\u001b[0m\n",
      "\u001b[92m2020-09-10 19:53:00\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mstart creating a lookup table and convert characters to indices\u001b[0m\n",
      "\u001b[92m2020-09-10 19:53:01\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32m-- create vocabulary\u001b[0m\n",
      "\u001b[92m2020-09-10 19:53:02\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32m-- convert tokens to indices\u001b[0m\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "                                                    "
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:53:02\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mskipping 0 lines\u001b[0m\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "                                                                     \r"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "\n",
      "============================================================\n",
      "List all parameters in the model\n",
      "============================================================\n",
      "emb.weight False\n",
      "rnn_1.weight_ih_l0 True\n",
      "rnn_1.weight_hh_l0 True\n",
      "rnn_1.bias_ih_l0 True\n",
      "rnn_1.bias_hh_l0 True\n",
      "rnn_1.weight_ih_l0_reverse True\n",
      "rnn_1.weight_hh_l0_reverse True\n",
      "rnn_1.bias_ih_l0_reverse True\n",
      "rnn_1.bias_hh_l0_reverse True\n",
      "rnn_1.weight_ih_l1 True\n",
      "rnn_1.weight_hh_l1 True\n",
      "rnn_1.bias_ih_l1 True\n",
      "rnn_1.bias_hh_l1 True\n",
      "rnn_1.weight_ih_l1_reverse True\n",
      "rnn_1.weight_hh_l1_reverse True\n",
      "rnn_1.bias_ih_l1_reverse True\n",
      "rnn_1.bias_hh_l1_reverse True\n",
      "attn_step1.weight True\n",
      "attn_step1.bias True\n",
      "attn_step2.weight True\n",
      "attn_step2.bias True\n",
      "fc1.weight True\n",
      "fc1.bias True\n",
      "fc2.weight True\n",
      "fc2.bias True\n",
      "============================================================\n",
      "\n",
      "\n",
      "\n",
      "\u001b[92m2020-09-10 19:53:03\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m******************************\u001b[0m\n",
      "\u001b[92m2020-09-10 19:53:03\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m**** (Bi-directional) RNN ****\u001b[0m\n",
      "\u001b[92m2020-09-10 19:53:03\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m******************************\u001b[0m\n",
      "\u001b[92m2020-09-10 19:53:03\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mNumber of batches: 125\u001b[0m\n",
      "\u001b[92m2020-09-10 19:53:03\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mNumber of epochs: 20\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "bf1d07e92f32424d8b158a7af364a60b",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=20.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=125.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "\n",
      "\n",
      "====================\n",
      "Total number of params: 611883\n",
      "\n",
      "two_parallel_rnns (\n",
      "  (emb): Embedding(7542, 60), weights=((7542, 60),), parameters=452520\n",
      "  (rnn_1): RNN(60, 60, num_layers=2, dropout=0.01, bidirectional=True), weights=((60, 60), (60, 60), (60,), (60,), (60, 60), (60, 60), (60,), (60,), (60, 120), (60, 60), (60,), (60,), (60, 120), (60, 60), (60,), (60,)), parameters=36480\n",
      "  (attn_step1): Linear(in_features=120, out_features=60, bias=True), weights=((60, 120), (60,)), parameters=7260\n",
      "  (attn_step2): Linear(in_features=60, out_features=1, bias=True), weights=((1, 60), (1,)), parameters=61\n",
      "  (fc1): Linear(in_features=960, out_features=120, bias=True), weights=((120, 960), (120,)), parameters=115320\n",
      "  (fc2): Linear(in_features=120, out_features=2, bias=True), weights=((2, 120), (2,)), parameters=242\n",
      ")\n",
      "====================\n",
      "\n",
      "\n",
      "\u001b[92m2020-09-10 19:53:08\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_19:53:08 -- Epoch: 1/20; Train; loss: 0.616; acc: 0.668; precision: 0.652, recall: 0.721, macrof1: 0.668, weightedf1: 0.668\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:53:17\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_19:53:17 -- Epoch: 1/20; Valid; loss: 0.478; acc: 0.775; precision: 0.744, recall: 0.838, macrof1: 0.774, weightedf1: 0.774\u001b[0m\n",
      "\u001b[92m2020-09-10 19:53:17\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=125.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:53:22\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_19:53:22 -- Epoch: 2/20; Train; loss: 0.385; acc: 0.829; precision: 0.819, recall: 0.844, macrof1: 0.829, weightedf1: 0.829\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:53:30\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_19:53:30 -- Epoch: 2/20; Valid; loss: 0.400; acc: 0.821; precision: 0.795, recall: 0.865, macrof1: 0.821, weightedf1: 0.821\u001b[0m\n",
      "\u001b[92m2020-09-10 19:53:30\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=125.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:53:36\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_19:53:36 -- Epoch: 3/20; Train; loss: 0.270; acc: 0.890; precision: 0.882, recall: 0.900, macrof1: 0.890, weightedf1: 0.890\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:53:44\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_19:53:44 -- Epoch: 3/20; Valid; loss: 0.370; acc: 0.841; precision: 0.826, recall: 0.865, macrof1: 0.841, weightedf1: 0.841\u001b[0m\n",
      "\u001b[92m2020-09-10 19:53:44\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=125.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:53:50\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_19:53:50 -- Epoch: 4/20; Train; loss: 0.186; acc: 0.931; precision: 0.925, recall: 0.939, macrof1: 0.931, weightedf1: 0.931\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:53:58\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_19:53:58 -- Epoch: 4/20; Valid; loss: 0.380; acc: 0.848; precision: 0.818, recall: 0.895, macrof1: 0.847, weightedf1: 0.847\u001b[0m\n",
      "\u001b[92m2020-09-10 19:53:58\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model (early stopped) with least valid loss (checkpoint: 3) at ./models/wikigaz_en_ft_ocr_rnn_v002_n8000/wikigaz_en_ft_ocr_rnn_v002_n8000.model\u001b[0m\n",
      "\u001b[92m2020-09-10 19:53:58\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n",
      "\u001b[92m2020-09-10 19:53:58\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mEarly stopping at epoch: 4, selected epoch: 3\u001b[0m\n",
      "\n",
      "\n",
      "\n",
      "\n",
      "====================\n",
      "User time: 55.5615\n",
      "====================\n"
     ]
    }
   ],
   "source": [
    "from DeezyMatch import finetune as dm_finetune\n",
    "\n",
    "n_ft_examples = 8000\n",
    "\n",
    "# Fine-tune the pretrained WG:en model (model B config: only the embedding layer is frozen)\n",
    "# on n_ft_examples training instances drawn from the OCR dataset.\n",
    "dm_finetune(input_file_path=\"./inputs/input_dfm_rnn_model_B.yaml\",\n",
    "            dataset_path=\"./dataset/ocr_trainval.txt\",\n",
    "            model_name=f\"wikigaz_en_ft_ocr_rnn_v002_n{n_ft_examples}\",\n",
    "            pretrained_model_path=\"./models/wikigaz_en_rnn_001/wikigaz_en_rnn_001.model\",\n",
    "            pretrained_vocab_path=\"./models/wikigaz_en_rnn_001/wikigaz_en_rnn_001.vocab\",\n",
    "            n_train_examples=n_ft_examples\n",
    "           )"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 71,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:53:58\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mread input file: ./inputs/input_dfm_rnn_model_B.yaml\u001b[0m\n",
      "\u001b[92m2020-09-10 19:53:58\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32mpytorch will use: cuda:1\u001b[0m\n",
      "\u001b[92m2020-09-10 19:53:58\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mread CSV file: ./dataset/ocr_trainval.txt\u001b[0m\n",
      "\u001b[92m2020-09-10 19:53:59\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32mnumber of labels, True: 42301 and False: 42302\u001b[0m\n",
      "\u001b[92m2020-09-10 19:53:59\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mSplitting the Dataset\u001b[0m\n",
      "\u001b[92m2020-09-10 19:53:59\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mfinish splitting the Dataset. User time: 0.05814766883850098\u001b[0m\n",
      "\u001b[92m2020-09-10 19:53:59\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msplits are as follow:\n",
      "not_assigned    43221\n",
      "val             25380\n",
      "train           16000\n",
      "test                2\n",
      "Name: split, dtype: int64\u001b[0m\n",
      "\u001b[92m2020-09-10 19:53:59\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mstart creating a lookup table and convert characters to indices\u001b[0m\n",
      "\u001b[92m2020-09-10 19:53:59\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32m-- create vocabulary\u001b[0m\n",
      "\u001b[92m2020-09-10 19:54:00\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32m-- convert tokens to indices\u001b[0m\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "s1 padding:   0%|          | 0/16000 [00:00<?, ?it/s]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:54:01\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mskipping 0 lines\u001b[0m\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "                                                                     \r"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "\n",
      "============================================================\n",
      "List all parameters in the model\n",
      "============================================================\n",
      "emb.weight False\n",
      "rnn_1.weight_ih_l0 True\n",
      "rnn_1.weight_hh_l0 True\n",
      "rnn_1.bias_ih_l0 True\n",
      "rnn_1.bias_hh_l0 True\n",
      "rnn_1.weight_ih_l0_reverse True\n",
      "rnn_1.weight_hh_l0_reverse True\n",
      "rnn_1.bias_ih_l0_reverse True\n",
      "rnn_1.bias_hh_l0_reverse True\n",
      "rnn_1.weight_ih_l1 True\n",
      "rnn_1.weight_hh_l1 True\n",
      "rnn_1.bias_ih_l1 True\n",
      "rnn_1.bias_hh_l1 True\n",
      "rnn_1.weight_ih_l1_reverse True\n",
      "rnn_1.weight_hh_l1_reverse True\n",
      "rnn_1.bias_ih_l1_reverse True\n",
      "rnn_1.bias_hh_l1_reverse True\n",
      "attn_step1.weight True\n",
      "attn_step1.bias True\n",
      "attn_step2.weight True\n",
      "attn_step2.bias True\n",
      "fc1.weight True\n",
      "fc1.bias True\n",
      "fc2.weight True\n",
      "fc2.bias True\n",
      "============================================================\n",
      "\n",
      "\n",
      "\n",
      "\u001b[92m2020-09-10 19:54:02\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m******************************\u001b[0m\n",
      "\u001b[92m2020-09-10 19:54:02\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m**** (Bi-directional) RNN ****\u001b[0m\n",
      "\u001b[92m2020-09-10 19:54:02\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m******************************\u001b[0m\n",
      "\u001b[92m2020-09-10 19:54:02\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mNumber of batches: 250\u001b[0m\n",
      "\u001b[92m2020-09-10 19:54:02\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mNumber of epochs: 20\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "70dc8596515c4005822c32d91cc6d250",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=20.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=250.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "\n",
      "\n",
      "====================\n",
      "Total number of params: 611883\n",
      "\n",
      "two_parallel_rnns (\n",
      "  (emb): Embedding(7542, 60), weights=((7542, 60),), parameters=452520\n",
      "  (rnn_1): RNN(60, 60, num_layers=2, dropout=0.01, bidirectional=True), weights=((60, 60), (60, 60), (60,), (60,), (60, 60), (60, 60), (60,), (60,), (60, 120), (60, 60), (60,), (60,), (60, 120), (60, 60), (60,), (60,)), parameters=36480\n",
      "  (attn_step1): Linear(in_features=120, out_features=60, bias=True), weights=((60, 120), (60,)), parameters=7260\n",
      "  (attn_step2): Linear(in_features=60, out_features=1, bias=True), weights=((1, 60), (1,)), parameters=61\n",
      "  (fc1): Linear(in_features=960, out_features=120, bias=True), weights=((120, 960), (120,)), parameters=115320\n",
      "  (fc2): Linear(in_features=120, out_features=2, bias=True), weights=((2, 120), (2,)), parameters=242\n",
      ")\n",
      "====================\n",
      "\n",
      "\n",
      "\u001b[92m2020-09-10 19:54:13\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_19:54:13 -- Epoch: 1/20; Train; loss: 0.528; acc: 0.732; precision: 0.715, recall: 0.772, macrof1: 0.732, weightedf1: 0.732\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:54:21\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_19:54:21 -- Epoch: 1/20; Valid; loss: 0.386; acc: 0.831; precision: 0.807, recall: 0.868, macrof1: 0.830, weightedf1: 0.830\u001b[0m\n",
      "\u001b[92m2020-09-10 19:54:21\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=250.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:54:32\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_19:54:32 -- Epoch: 2/20; Train; loss: 0.310; acc: 0.871; precision: 0.859, recall: 0.888, macrof1: 0.871, weightedf1: 0.871\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:54:40\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_19:54:40 -- Epoch: 2/20; Valid; loss: 0.327; acc: 0.863; precision: 0.875, recall: 0.848, macrof1: 0.863, weightedf1: 0.863\u001b[0m\n",
      "\u001b[92m2020-09-10 19:54:40\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=250.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:54:51\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_19:54:51 -- Epoch: 3/20; Train; loss: 0.218; acc: 0.916; precision: 0.908, recall: 0.925, macrof1: 0.916, weightedf1: 0.916\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:54:59\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_19:54:59 -- Epoch: 3/20; Valid; loss: 0.309; acc: 0.878; precision: 0.868, recall: 0.891, macrof1: 0.878, weightedf1: 0.878\u001b[0m\n",
      "\u001b[92m2020-09-10 19:54:59\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=250.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:55:10\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_19:55:10 -- Epoch: 4/20; Train; loss: 0.161; acc: 0.941; precision: 0.934, recall: 0.950, macrof1: 0.941, weightedf1: 0.941\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:55:18\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_19:55:18 -- Epoch: 4/20; Valid; loss: 0.300; acc: 0.887; precision: 0.886, recall: 0.889, macrof1: 0.887, weightedf1: 0.887\u001b[0m\n",
      "\u001b[92m2020-09-10 19:55:18\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=250.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:55:29\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_19:55:29 -- Epoch: 5/20; Train; loss: 0.116; acc: 0.958; precision: 0.953, recall: 0.964, macrof1: 0.958, weightedf1: 0.958\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:55:37\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_19:55:37 -- Epoch: 5/20; Valid; loss: 0.315; acc: 0.888; precision: 0.887, recall: 0.890, macrof1: 0.888, weightedf1: 0.888\u001b[0m\n",
      "\u001b[92m2020-09-10 19:55:37\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model (early stopped) with least valid loss (checkpoint: 4) at ./models/wikigaz_en_ft_ocr_rnn_v002_n16000/wikigaz_en_ft_ocr_rnn_v002_n16000.model\u001b[0m\n",
      "\u001b[92m2020-09-10 19:55:37\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n",
      "\u001b[92m2020-09-10 19:55:37\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mEarly stopping at epoch: 5, selected epoch: 4\u001b[0m\n",
      "\n",
      "\n",
      "\n",
      "\n",
      "====================\n",
      "User time: 95.3452\n",
      "====================\n"
     ]
    }
   ],
   "source": [
    "from DeezyMatch import finetune as dm_finetune\n",
    "\n",
    "n_ft_examples = 16000\n",
    "\n",
    "# fine-tune a pretrained model stored at pretrained_model_path and pretrained_vocab_path \n",
    "dm_finetune(input_file_path=\"./inputs/input_dfm_rnn_model_B.yaml\", \n",
    "            dataset_path=\"./dataset/ocr_trainval.txt\", \n",
    "            model_name=f\"wikigaz_en_ft_ocr_rnn_v002_n{n_ft_examples}\",\n",
    "            pretrained_model_path=\"./models/wikigaz_en_rnn_001/wikigaz_en_rnn_001.model\", \n",
    "            pretrained_vocab_path=\"./models/wikigaz_en_rnn_001/wikigaz_en_rnn_001.vocab\",\n",
    "            n_train_examples=n_ft_examples\n",
    "           )"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 72,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:55:37\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mread input file: ./inputs/input_dfm_rnn_model_B.yaml\u001b[0m\n",
      "\u001b[92m2020-09-10 19:55:37\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32mpytorch will use: cuda:1\u001b[0m\n",
      "\u001b[92m2020-09-10 19:55:37\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mread CSV file: ./dataset/ocr_trainval.txt\u001b[0m\n",
      "\u001b[92m2020-09-10 19:55:38\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32mnumber of labels, True: 42301 and False: 42302\u001b[0m\n",
      "\u001b[92m2020-09-10 19:55:38\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mSplitting the Dataset\u001b[0m\n",
      "\u001b[92m2020-09-10 19:55:38\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mfinish splitting the Dataset. User time: 0.04746866226196289\u001b[0m\n",
      "\u001b[92m2020-09-10 19:55:38\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msplits are as follow:\n",
      "train           32000\n",
      "not_assigned    27221\n",
      "val             25380\n",
      "test                2\n",
      "Name: split, dtype: int64\u001b[0m\n",
      "\u001b[92m2020-09-10 19:55:38\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mstart creating a lookup table and convert characters to indices\u001b[0m\n",
      "\u001b[92m2020-09-10 19:55:38\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32m-- create vocabulary\u001b[0m\n",
      "\u001b[92m2020-09-10 19:55:39\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32m-- convert tokens to indices\u001b[0m\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "s1 padding:   0%|          | 0/32000 [00:00<?, ?it/s]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:55:39\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mskipping 0 lines\u001b[0m\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "                                                                     \r"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "\n",
      "============================================================\n",
      "List all parameters in the model\n",
      "============================================================\n",
      "emb.weight False\n",
      "rnn_1.weight_ih_l0 True\n",
      "rnn_1.weight_hh_l0 True\n",
      "rnn_1.bias_ih_l0 True\n",
      "rnn_1.bias_hh_l0 True\n",
      "rnn_1.weight_ih_l0_reverse True\n",
      "rnn_1.weight_hh_l0_reverse True\n",
      "rnn_1.bias_ih_l0_reverse True\n",
      "rnn_1.bias_hh_l0_reverse True\n",
      "rnn_1.weight_ih_l1 True\n",
      "rnn_1.weight_hh_l1 True\n",
      "rnn_1.bias_ih_l1 True\n",
      "rnn_1.bias_hh_l1 True\n",
      "rnn_1.weight_ih_l1_reverse True\n",
      "rnn_1.weight_hh_l1_reverse True\n",
      "rnn_1.bias_ih_l1_reverse True\n",
      "rnn_1.bias_hh_l1_reverse True\n",
      "attn_step1.weight True\n",
      "attn_step1.bias True\n",
      "attn_step2.weight True\n",
      "attn_step2.bias True\n",
      "fc1.weight True\n",
      "fc1.bias True\n",
      "fc2.weight True\n",
      "fc2.bias True\n",
      "============================================================\n",
      "\n",
      "\n",
      "\n",
      "\u001b[92m2020-09-10 19:55:40\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m******************************\u001b[0m\n",
      "\u001b[92m2020-09-10 19:55:40\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m**** (Bi-directional) RNN ****\u001b[0m\n",
      "\u001b[92m2020-09-10 19:55:40\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m******************************\u001b[0m\n",
      "\u001b[92m2020-09-10 19:55:40\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mNumber of batches: 500\u001b[0m\n",
      "\u001b[92m2020-09-10 19:55:40\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mNumber of epochs: 20\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "1e909a88802641a1928de01fc53e66eb",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=20.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=500.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "\n",
      "\n",
      "====================\n",
      "Total number of params: 611883\n",
      "\n",
      "two_parallel_rnns (\n",
      "  (emb): Embedding(7542, 60), weights=((7542, 60),), parameters=452520\n",
      "  (rnn_1): RNN(60, 60, num_layers=2, dropout=0.01, bidirectional=True), weights=((60, 60), (60, 60), (60,), (60,), (60, 60), (60, 60), (60,), (60,), (60, 120), (60, 60), (60,), (60,), (60, 120), (60, 60), (60,), (60,)), parameters=36480\n",
      "  (attn_step1): Linear(in_features=120, out_features=60, bias=True), weights=((60, 120), (60,)), parameters=7260\n",
      "  (attn_step2): Linear(in_features=60, out_features=1, bias=True), weights=((1, 60), (1,)), parameters=61\n",
      "  (fc1): Linear(in_features=960, out_features=120, bias=True), weights=((120, 960), (120,)), parameters=115320\n",
      "  (fc2): Linear(in_features=120, out_features=2, bias=True), weights=((2, 120), (2,)), parameters=242\n",
      ")\n",
      "====================\n",
      "\n",
      "\n",
      "\u001b[92m2020-09-10 19:56:02\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_19:56:02 -- Epoch: 1/20; Train; loss: 0.434; acc: 0.794; precision: 0.778, recall: 0.822, macrof1: 0.794, weightedf1: 0.794\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:56:10\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_19:56:10 -- Epoch: 1/20; Valid; loss: 0.315; acc: 0.865; precision: 0.835, recall: 0.910, macrof1: 0.865, weightedf1: 0.865\u001b[0m\n",
      "\u001b[92m2020-09-10 19:56:10\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=500.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:56:32\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_19:56:32 -- Epoch: 2/20; Train; loss: 0.255; acc: 0.898; precision: 0.888, recall: 0.910, macrof1: 0.898, weightedf1: 0.898\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:56:40\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_19:56:40 -- Epoch: 2/20; Valid; loss: 0.262; acc: 0.896; precision: 0.886, recall: 0.910, macrof1: 0.896, weightedf1: 0.896\u001b[0m\n",
      "\u001b[92m2020-09-10 19:56:40\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=500.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:57:03\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_19:57:03 -- Epoch: 3/20; Train; loss: 0.193; acc: 0.926; precision: 0.918, recall: 0.936, macrof1: 0.926, weightedf1: 0.926\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:57:12\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_19:57:12 -- Epoch: 3/20; Valid; loss: 0.241; acc: 0.905; precision: 0.898, recall: 0.914, macrof1: 0.905, weightedf1: 0.905\u001b[0m\n",
      "\u001b[92m2020-09-10 19:57:12\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=500.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:57:35\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_19:57:35 -- Epoch: 4/20; Train; loss: 0.154; acc: 0.941; precision: 0.934, recall: 0.949, macrof1: 0.941, weightedf1: 0.941\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:57:43\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_19:57:43 -- Epoch: 4/20; Valid; loss: 0.241; acc: 0.910; precision: 0.891, recall: 0.933, macrof1: 0.910, weightedf1: 0.910\u001b[0m\n",
      "\u001b[92m2020-09-10 19:57:43\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model (early stopped) with least valid loss (checkpoint: 3) at ./models/wikigaz_en_ft_ocr_rnn_v002_n32000/wikigaz_en_ft_ocr_rnn_v002_n32000.model\u001b[0m\n",
      "\u001b[92m2020-09-10 19:57:43\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n",
      "\u001b[92m2020-09-10 19:57:43\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mEarly stopping at epoch: 4, selected epoch: 3\u001b[0m\n",
      "\n",
      "\n",
      "\n",
      "\n",
      "====================\n",
      "User time: 123.0453\n",
      "====================\n"
     ]
    }
   ],
   "source": [
    "from DeezyMatch import finetune as dm_finetune\n",
    "\n",
    "n_ft_examples = 32000\n",
    "\n",
    "# fine-tune a pretrained model stored at pretrained_model_path and pretrained_vocab_path \n",
    "dm_finetune(input_file_path=\"./inputs/input_dfm_rnn_model_B.yaml\", \n",
    "            dataset_path=\"./dataset/ocr_trainval.txt\", \n",
    "            model_name=f\"wikigaz_en_ft_ocr_rnn_v002_n{n_ft_examples}\",\n",
    "            pretrained_model_path=\"./models/wikigaz_en_rnn_001/wikigaz_en_rnn_001.model\", \n",
    "            pretrained_vocab_path=\"./models/wikigaz_en_rnn_001/wikigaz_en_rnn_001.vocab\",\n",
    "            n_train_examples=n_ft_examples\n",
    "           )"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 73,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:57:43\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mread input file: ./inputs/input_dfm_rnn_model_B_no_early_stopping.yaml\u001b[0m\n",
      "\u001b[92m2020-09-10 19:57:43\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32mpytorch will use: cuda:1\u001b[0m\n",
      "\u001b[92m2020-09-10 19:57:43\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mread CSV file: ./dataset/ocr_trainval.txt\u001b[0m\n",
      "\u001b[92m2020-09-10 19:57:44\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32mnumber of labels, True: 42301 and False: 42302\u001b[0m\n",
      "\u001b[92m2020-09-10 19:57:44\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mSplitting the Dataset\u001b[0m\n",
      "\u001b[92m2020-09-10 19:57:44\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mfinish splitting the Dataset. User time: 0.0603642463684082\u001b[0m\n",
      "\u001b[92m2020-09-10 19:57:44\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msplits are as follow:\n",
      "train    64000\n",
      "val      20603\n",
      "Name: split, dtype: int64\u001b[0m\n",
      "\u001b[92m2020-09-10 19:57:44\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mstart creating a lookup table and convert characters to indices\u001b[0m\n",
      "\u001b[92m2020-09-10 19:57:44\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32m-- create vocabulary\u001b[0m\n",
      "\u001b[92m2020-09-10 19:57:45\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32m-- convert tokens to indices\u001b[0m\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "length s2:   0%|          | 0/64000 [00:00<?, ?it/s]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:57:46\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mskipping 0 lines\u001b[0m\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "                                                                     \r"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "\n",
      "============================================================\n",
      "List all parameters in the model\n",
      "============================================================\n",
      "emb.weight False\n",
      "rnn_1.weight_ih_l0 True\n",
      "rnn_1.weight_hh_l0 True\n",
      "rnn_1.bias_ih_l0 True\n",
      "rnn_1.bias_hh_l0 True\n",
      "rnn_1.weight_ih_l0_reverse True\n",
      "rnn_1.weight_hh_l0_reverse True\n",
      "rnn_1.bias_ih_l0_reverse True\n",
      "rnn_1.bias_hh_l0_reverse True\n",
      "rnn_1.weight_ih_l1 True\n",
      "rnn_1.weight_hh_l1 True\n",
      "rnn_1.bias_ih_l1 True\n",
      "rnn_1.bias_hh_l1 True\n",
      "rnn_1.weight_ih_l1_reverse True\n",
      "rnn_1.weight_hh_l1_reverse True\n",
      "rnn_1.bias_ih_l1_reverse True\n",
      "rnn_1.bias_hh_l1_reverse True\n",
      "attn_step1.weight True\n",
      "attn_step1.bias True\n",
      "attn_step2.weight True\n",
      "attn_step2.bias True\n",
      "fc1.weight True\n",
      "fc1.bias True\n",
      "fc2.weight True\n",
      "fc2.bias True\n",
      "============================================================\n",
      "\n",
      "\n",
      "\n",
      "\u001b[92m2020-09-10 19:57:47\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m******************************\u001b[0m\n",
      "\u001b[92m2020-09-10 19:57:47\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m**** (Bi-directional) RNN ****\u001b[0m\n",
      "\u001b[92m2020-09-10 19:57:47\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m******************************\u001b[0m\n",
      "\u001b[92m2020-09-10 19:57:47\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mNumber of batches: 1000\u001b[0m\n",
      "\u001b[92m2020-09-10 19:57:47\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mNumber of epochs: 10\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "a9a16b2a0c4d430cbc66ad17f2ff6edd",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=10.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=1000.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "\n",
      "\n",
      "====================\n",
      "Total number of params: 611883\n",
      "\n",
      "two_parallel_rnns (\n",
      "  (emb): Embedding(7542, 60), weights=((7542, 60),), parameters=452520\n",
      "  (rnn_1): RNN(60, 60, num_layers=2, dropout=0.01, bidirectional=True), weights=((60, 60), (60, 60), (60,), (60,), (60, 60), (60, 60), (60,), (60,), (60, 120), (60, 60), (60,), (60,), (60, 120), (60, 60), (60,), (60,)), parameters=36480\n",
      "  (attn_step1): Linear(in_features=120, out_features=60, bias=True), weights=((60, 120), (60,)), parameters=7260\n",
      "  (attn_step2): Linear(in_features=60, out_features=1, bias=True), weights=((1, 60), (1,)), parameters=61\n",
      "  (fc1): Linear(in_features=960, out_features=120, bias=True), weights=((120, 960), (120,)), parameters=115320\n",
      "  (fc2): Linear(in_features=120, out_features=2, bias=True), weights=((2, 120), (2,)), parameters=242\n",
      ")\n",
      "====================\n",
      "\n",
      "\n",
      "\u001b[92m2020-09-10 19:58:33\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_19:58:33 -- Epoch: 1/10; Train; loss: 0.356; acc: 0.842; precision: 0.830, recall: 0.860, macrof1: 0.842, weightedf1: 0.842\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=322.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:58:40\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_19:58:40 -- Epoch: 1/10; Valid; loss: 0.254; acc: 0.897; precision: 0.871, recall: 0.932, macrof1: 0.897, weightedf1: 0.897\u001b[0m\n",
      "\u001b[92m2020-09-10 19:58:40\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=1000.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:59:26\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_19:59:26 -- Epoch: 2/10; Train; loss: 0.210; acc: 0.918; precision: 0.908, recall: 0.929, macrof1: 0.918, weightedf1: 0.918\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=322.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 19:59:32\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_19:59:32 -- Epoch: 2/10; Valid; loss: 0.212; acc: 0.920; precision: 0.912, recall: 0.928, macrof1: 0.920, weightedf1: 0.920\u001b[0m\n",
      "\u001b[92m2020-09-10 19:59:32\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=1000.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 20:00:18\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_20:00:18 -- Epoch: 3/10; Train; loss: 0.167; acc: 0.936; precision: 0.928, recall: 0.946, macrof1: 0.936, weightedf1: 0.936\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=322.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 20:00:25\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_20:00:25 -- Epoch: 3/10; Valid; loss: 0.202; acc: 0.925; precision: 0.911, recall: 0.941, macrof1: 0.925, weightedf1: 0.925\u001b[0m\n",
      "\u001b[92m2020-09-10 20:00:25\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=1000.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 20:01:11\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_20:01:11 -- Epoch: 4/10; Train; loss: 0.140; acc: 0.948; precision: 0.940, recall: 0.957, macrof1: 0.948, weightedf1: 0.948\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=322.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 20:01:17\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_20:01:17 -- Epoch: 4/10; Valid; loss: 0.203; acc: 0.922; precision: 0.915, recall: 0.931, macrof1: 0.922, weightedf1: 0.922\u001b[0m\n",
      "\u001b[92m2020-09-10 20:01:17\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=1000.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 20:02:03\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_20:02:03 -- Epoch: 5/10; Train; loss: 0.121; acc: 0.956; precision: 0.950, recall: 0.962, macrof1: 0.956, weightedf1: 0.956\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=322.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 20:02:10\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_20:02:10 -- Epoch: 5/10; Valid; loss: 0.185; acc: 0.934; precision: 0.925, recall: 0.944, macrof1: 0.934, weightedf1: 0.934\u001b[0m\n",
      "\u001b[92m2020-09-10 20:02:10\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=1000.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 20:02:56\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_20:02:56 -- Epoch: 6/10; Train; loss: 0.108; acc: 0.960; precision: 0.954, recall: 0.966, macrof1: 0.960, weightedf1: 0.960\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=322.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 20:03:03\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_20:03:03 -- Epoch: 6/10; Valid; loss: 0.186; acc: 0.935; precision: 0.916, recall: 0.958, macrof1: 0.935, weightedf1: 0.935\u001b[0m\n",
      "\u001b[92m2020-09-10 20:03:03\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=1000.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 20:03:49\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_20:03:49 -- Epoch: 7/10; Train; loss: 0.094; acc: 0.965; precision: 0.960, recall: 0.971, macrof1: 0.965, weightedf1: 0.965\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=322.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 20:03:55\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_20:03:55 -- Epoch: 7/10; Valid; loss: 0.188; acc: 0.935; precision: 0.929, recall: 0.942, macrof1: 0.935, weightedf1: 0.935\u001b[0m\n",
      "\u001b[92m2020-09-10 20:03:55\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=1000.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 20:04:41\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_20:04:41 -- Epoch: 8/10; Train; loss: 0.087; acc: 0.969; precision: 0.964, recall: 0.974, macrof1: 0.969, weightedf1: 0.969\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=322.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 20:04:48\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_20:04:48 -- Epoch: 8/10; Valid; loss: 0.180; acc: 0.938; precision: 0.930, recall: 0.947, macrof1: 0.938, weightedf1: 0.938\u001b[0m\n",
      "\u001b[92m2020-09-10 20:04:48\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=1000.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 20:05:34\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_20:05:34 -- Epoch: 9/10; Train; loss: 0.078; acc: 0.972; precision: 0.968, recall: 0.976, macrof1: 0.972, weightedf1: 0.972\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=322.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 20:05:41\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_20:05:41 -- Epoch: 9/10; Valid; loss: 0.201; acc: 0.935; precision: 0.941, recall: 0.928, macrof1: 0.935, weightedf1: 0.935\u001b[0m\n",
      "\u001b[92m2020-09-10 20:05:41\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=1000.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 20:06:27\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_20:06:27 -- Epoch: 10/10; Train; loss: 0.071; acc: 0.974; precision: 0.971, recall: 0.978, macrof1: 0.974, weightedf1: 0.974\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=322.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 20:06:33\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_20:06:33 -- Epoch: 10/10; Valid; loss: 0.194; acc: 0.936; precision: 0.931, recall: 0.942, macrof1: 0.936, weightedf1: 0.936\u001b[0m\n",
      "\u001b[92m2020-09-10 20:06:33\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n",
      "\n",
      "\u001b[92m2020-09-10 20:06:33\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model with least valid loss (checkpoint: 8) at ./models/wikigaz_en_ft_ocr_rnn_v002_n64000/wikigaz_en_ft_ocr_rnn_v002_n64000.model\u001b[0m\n",
      "\n",
      "\n",
      "\n",
      "====================\n",
      "User time: 526.3548\n",
      "====================\n"
     ]
    }
   ],
   "source": [
    "from DeezyMatch import finetune as dm_finetune\n",
    "\n",
    "n_ft_examples = 64000\n",
    "\n",
    "# Fine-tune the pretrained model given by pretrained_model_path and pretrained_vocab_path,\n",
    "# using n_ft_examples training instances from the OCR dataset\n",
    "dm_finetune(input_file_path=\"./inputs/input_dfm_rnn_model_B_no_early_stopping.yaml\",\n",
    "            dataset_path=\"./dataset/ocr_trainval.txt\",\n",
    "            model_name=f\"wikigaz_en_ft_ocr_rnn_v002_n{n_ft_examples}\",\n",
    "            pretrained_model_path=\"./models/wikigaz_en_rnn_001/wikigaz_en_rnn_001.model\",\n",
    "            pretrained_vocab_path=\"./models/wikigaz_en_rnn_001/wikigaz_en_rnn_001.vocab\",\n",
    "            n_train_examples=n_ft_examples\n",
    "           )"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 74,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 20:06:33\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mread input file: ./inputs/input_dfm_rnn_model_B_no_early_stopping.yaml\u001b[0m\n",
      "\u001b[92m2020-09-10 20:06:33\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32mpytorch will use: cuda:1\u001b[0m\n",
      "\u001b[92m2020-09-10 20:06:33\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mread CSV file: ./dataset/ocr_trainval.txt\u001b[0m\n",
      "\u001b[92m2020-09-10 20:06:34\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32mnumber of labels, True: 42301 and False: 42302\u001b[0m\n",
      "\u001b[92m2020-09-10 20:06:34\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mSplitting the Dataset\u001b[0m\n",
      "\u001b[92m2020-09-10 20:06:34\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mfinish splitting the Dataset. User time: 0.055474281311035156\u001b[0m\n",
      "\u001b[92m2020-09-10 20:06:34\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msplits are as follow:\n",
      "train    84000\n",
      "val        603\n",
      "Name: split, dtype: int64\u001b[0m\n",
      "\u001b[92m2020-09-10 20:06:34\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mstart creating a lookup table and convert characters to indices\u001b[0m\n",
      "\u001b[92m2020-09-10 20:06:34\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32m-- create vocabulary\u001b[0m\n",
      "\u001b[92m2020-09-10 20:06:35\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32m-- convert tokens to indices\u001b[0m\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      "length s1:   0%|          | 0/84000 [00:00<?, ?it/s]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 20:06:36\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mskipping 0 lines\u001b[0m\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "                                                                     \r"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "\n",
      "============================================================\n",
      "List all parameters in the model\n",
      "============================================================\n",
      "emb.weight False\n",
      "rnn_1.weight_ih_l0 True\n",
      "rnn_1.weight_hh_l0 True\n",
      "rnn_1.bias_ih_l0 True\n",
      "rnn_1.bias_hh_l0 True\n",
      "rnn_1.weight_ih_l0_reverse True\n",
      "rnn_1.weight_hh_l0_reverse True\n",
      "rnn_1.bias_ih_l0_reverse True\n",
      "rnn_1.bias_hh_l0_reverse True\n",
      "rnn_1.weight_ih_l1 True\n",
      "rnn_1.weight_hh_l1 True\n",
      "rnn_1.bias_ih_l1 True\n",
      "rnn_1.bias_hh_l1 True\n",
      "rnn_1.weight_ih_l1_reverse True\n",
      "rnn_1.weight_hh_l1_reverse True\n",
      "rnn_1.bias_ih_l1_reverse True\n",
      "rnn_1.bias_hh_l1_reverse True\n",
      "attn_step1.weight True\n",
      "attn_step1.bias True\n",
      "attn_step2.weight True\n",
      "attn_step2.bias True\n",
      "fc1.weight True\n",
      "fc1.bias True\n",
      "fc2.weight True\n",
      "fc2.bias True\n",
      "============================================================\n",
      "\n",
      "\n",
      "\n",
      "\u001b[92m2020-09-10 20:06:37\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m******************************\u001b[0m\n",
      "\u001b[92m2020-09-10 20:06:37\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m**** (Bi-directional) RNN ****\u001b[0m\n",
      "\u001b[92m2020-09-10 20:06:37\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[95m******************************\u001b[0m\n",
      "\u001b[92m2020-09-10 20:06:37\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mNumber of batches: 1313\u001b[0m\n",
      "\u001b[92m2020-09-10 20:06:37\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[2;32mNumber of epochs: 10\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "f30c3225e1c543948d66b7396660ee40",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=10.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=1313.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "\n",
      "\n",
      "====================\n",
      "Total number of params: 611883\n",
      "\n",
      "two_parallel_rnns (\n",
      "  (emb): Embedding(7542, 60), weights=((7542, 60),), parameters=452520\n",
      "  (rnn_1): RNN(60, 60, num_layers=2, dropout=0.01, bidirectional=True), weights=((60, 60), (60, 60), (60,), (60,), (60, 60), (60, 60), (60,), (60,), (60, 120), (60, 60), (60,), (60,), (60, 120), (60, 60), (60,), (60,)), parameters=36480\n",
      "  (attn_step1): Linear(in_features=120, out_features=60, bias=True), weights=((60, 120), (60,)), parameters=7260\n",
      "  (attn_step2): Linear(in_features=60, out_features=1, bias=True), weights=((1, 60), (1,)), parameters=61\n",
      "  (fc1): Linear(in_features=960, out_features=120, bias=True), weights=((120, 960), (120,)), parameters=115320\n",
      "  (fc2): Linear(in_features=120, out_features=2, bias=True), weights=((2, 120), (2,)), parameters=242\n",
      ")\n",
      "====================\n",
      "\n",
      "\n",
      "\u001b[92m2020-09-10 20:07:37\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_20:07:37 -- Epoch: 1/10; Train; loss: 0.326; acc: 0.858; precision: 0.845, recall: 0.877, macrof1: 0.858, weightedf1: 0.858\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=10.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 20:07:37\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_20:07:37 -- Epoch: 1/10; Valid; loss: 0.239; acc: 0.887; precision: 0.852, recall: 0.937, macrof1: 0.887, weightedf1: 0.887\u001b[0m\n",
      "\u001b[92m2020-09-10 20:07:37\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=1313.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 20:08:38\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_20:08:38 -- Epoch: 2/10; Train; loss: 0.193; acc: 0.925; precision: 0.916, recall: 0.936, macrof1: 0.925, weightedf1: 0.925\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=10.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 20:08:38\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_20:08:38 -- Epoch: 2/10; Valid; loss: 0.160; acc: 0.930; precision: 0.925, recall: 0.937, macrof1: 0.930, weightedf1: 0.930\u001b[0m\n",
      "\u001b[92m2020-09-10 20:08:38\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=1313.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 20:09:38\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_20:09:38 -- Epoch: 3/10; Train; loss: 0.158; acc: 0.941; precision: 0.932, recall: 0.951, macrof1: 0.940, weightedf1: 0.940\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=10.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 20:09:38\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_20:09:38 -- Epoch: 3/10; Valid; loss: 0.153; acc: 0.927; precision: 0.903, recall: 0.957, macrof1: 0.927, weightedf1: 0.927\u001b[0m\n",
      "\u001b[92m2020-09-10 20:09:38\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=1313.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 20:10:39\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_20:10:39 -- Epoch: 4/10; Train; loss: 0.134; acc: 0.951; precision: 0.944, recall: 0.960, macrof1: 0.951, weightedf1: 0.951\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=10.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 20:10:39\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_20:10:39 -- Epoch: 4/10; Valid; loss: 0.145; acc: 0.940; precision: 0.929, recall: 0.953, macrof1: 0.940, weightedf1: 0.940\u001b[0m\n",
      "\u001b[92m2020-09-10 20:10:39\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=1313.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 20:11:39\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_20:11:39 -- Epoch: 5/10; Train; loss: 0.122; acc: 0.955; precision: 0.949, recall: 0.963, macrof1: 0.955, weightedf1: 0.955\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=10.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 20:11:39\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_20:11:39 -- Epoch: 5/10; Valid; loss: 0.141; acc: 0.940; precision: 0.915, recall: 0.970, macrof1: 0.940, weightedf1: 0.940\u001b[0m\n",
      "\u001b[92m2020-09-10 20:11:39\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=1313.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 20:12:39\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_20:12:39 -- Epoch: 6/10; Train; loss: 0.112; acc: 0.959; precision: 0.953, recall: 0.967, macrof1: 0.959, weightedf1: 0.959\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=10.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 20:12:39\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_20:12:39 -- Epoch: 6/10; Valid; loss: 0.121; acc: 0.947; precision: 0.927, recall: 0.970, macrof1: 0.947, weightedf1: 0.947\u001b[0m\n",
      "\u001b[92m2020-09-10 20:12:39\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=1313.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 20:13:39\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_20:13:39 -- Epoch: 7/10; Train; loss: 0.100; acc: 0.964; precision: 0.958, recall: 0.970, macrof1: 0.964, weightedf1: 0.964\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=10.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 20:13:40\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_20:13:40 -- Epoch: 7/10; Valid; loss: 0.143; acc: 0.952; precision: 0.939, recall: 0.967, macrof1: 0.952, weightedf1: 0.952\u001b[0m\n",
      "\u001b[92m2020-09-10 20:13:40\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=1313.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 20:14:40\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_20:14:40 -- Epoch: 8/10; Train; loss: 0.090; acc: 0.967; precision: 0.961, recall: 0.974, macrof1: 0.967, weightedf1: 0.967\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=10.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 20:14:40\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_20:14:40 -- Epoch: 8/10; Valid; loss: 0.123; acc: 0.949; precision: 0.935, recall: 0.963, macrof1: 0.949, weightedf1: 0.949\u001b[0m\n",
      "\u001b[92m2020-09-10 20:14:40\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=1313.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 20:15:41\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_20:15:41 -- Epoch: 9/10; Train; loss: 0.086; acc: 0.969; precision: 0.964, recall: 0.974, macrof1: 0.969, weightedf1: 0.969\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=10.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 20:15:41\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_20:15:41 -- Epoch: 9/10; Valid; loss: 0.106; acc: 0.957; precision: 0.948, recall: 0.967, macrof1: 0.957, weightedf1: 0.957\u001b[0m\n",
      "\u001b[92m2020-09-10 20:15:41\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=1313.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 20:16:41\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[0;33m09/10/2020_20:16:41 -- Epoch: 10/10; Train; loss: 0.080; acc: 0.971; precision: 0.966, recall: 0.976, macrof1: 0.971, weightedf1: 0.971\u001b[0m\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=10.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[92m2020-09-10 20:16:41\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;31m09/10/2020_20:16:41 -- Epoch: 10/10; Valid; loss: 0.117; acc: 0.957; precision: 0.954, recall: 0.960, macrof1: 0.957, weightedf1: 0.957\u001b[0m\n",
      "\u001b[92m2020-09-10 20:16:41\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model\u001b[0m\n",
      "\n",
      "\u001b[92m2020-09-10 20:16:41\u001b[0m \u001b[95mlwm-embeddings\u001b[0m \u001b[1m\u001b[90m[INFO]\u001b[0m \u001b[1;32msaving the model with least valid loss (checkpoint: 9) at ./models/wikigaz_en_ft_ocr_rnn_v002_n84000/wikigaz_en_ft_ocr_rnn_v002_n84000.model\u001b[0m\n",
      "\n",
      "\n",
      "\n",
      "====================\n",
      "User time: 604.0644\n",
      "====================\n"
     ]
    }
   ],
   "source": [
    "from DeezyMatch import finetune as dm_finetune\n",
    "\n",
    "n_ft_examples = 84000\n",
    "\n",
    "# Fine-tune the pretrained model at pretrained_model_path (with its vocabulary at\n",
    "# pretrained_vocab_path) on n_ft_examples training examples from dataset_path.\n",
    "dm_finetune(input_file_path=\"./inputs/input_dfm_rnn_model_B_no_early_stopping.yaml\",\n",
    "            dataset_path=\"./dataset/ocr_trainval.txt\",\n",
    "            model_name=f\"wikigaz_en_ft_ocr_rnn_v002_n{n_ft_examples}\",\n",
    "            pretrained_model_path=\"./models/wikigaz_en_rnn_001/wikigaz_en_rnn_001.model\",\n",
    "            pretrained_vocab_path=\"./models/wikigaz_en_rnn_001/wikigaz_en_rnn_001.vocab\",\n",
    "            n_train_examples=n_ft_examples\n",
    "           )"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.8"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
